Annealing diffusions in a slowly growing potential 

Pierre-Andre Zitt* 
2nd February 2008 



Abstract 

We consider a continuous analogue of the simulated annealing algorithm in $\mathbb{R}^d$, namely the solution
of the SDE $dX_t = \sigma(t)\,dB_t - \nabla V(X_t)\,dt$, where $V$ is a function called potential. We prove a convergence
result, similar to the one in [Mic91], under weaker hypotheses on the potential function. In particular,
we cover cases where the gradient of the potential goes to zero at infinity. The main idea is to replace the
Poincaré and log-Sobolev inequalities used in [Mic91, HCS87] by weak Poincaré inequalities (introduced
in [RW01]), and to estimate constants with measure-capacity criteria. We show that the convergence
still holds for the "classical" schedule $\sigma(t) = c/\ln(t)$, where $c$ is bigger than a constant related to $V$.
Introduction

The goal of this article is to study a continuous analogue of a discrete optimization algorithm called simulated 
annealing. This algorithm was introduced in 1983 by Kirkpatrick, Gelatt and Vecchi, and aims at finding 
"good" (if not perfect) solutions to complex problems. The crucial idea is to perturb the standard gradient 
descent by a random noise; hopefully this noise will get the process out of traps (local minima), and help it
reach the global minimum. The noise is taken relatively big at the beginning, so that the process explores 
the space, and is gradually reduced thereafter. 

The standard case is the discrete case (in time and space); here we consider a process on $\mathbb{R}^d$ in continuous
time. Note that more complicated state spaces have been studied, see for example [Jac94, JR95, Jac96]: here
we will stick to $\mathbb{R}^d$. This "annealing diffusion" process has already been studied by several authors. Hwang,
Chiang and Sheu ([HCS87]) proved the convergence under quite strong assumptions, using comparisons with
the associated (ordinary) differential equation and results on the trajectories (estimates of exit times from
domains, etc.). The result was enhanced by Royer ([Roy89]). The approach we follow was developed by
L. Miclo in [Mic92] (and in his doctoral dissertation [Mic91]), and reduces the problem to the convergence
of a single quantity, the free energy. Since then, other questions have been asked: speed of convergence,
choice of a better algorithm etc. (see e.g. the survey [Loc00]). Let us also note that the "functional
inequalities" approach has also been used extensively for other (possibly discrete) models, and other closely
related algorithms (see e.g. [DMM99] for a study of a generalized simulated annealing process).

A common feature of these works on global optimization on $\mathbb{R}^d$ is the quite strong assumptions they
require on the growth of the potential. In particular, the norm of the gradient is supposed to go to infinity 
at infinity. These hypotheses are technically useful: they guarantee that, at any fixed temperature, the 
generator has a spectral gap, which in turn gives estimates on the rate of convergence. Let us note that the 
"cooling schedule" (i.e. the choice of the temperature as a function of time) for which the process converges 
is linked with the speed of explosion of the spectral gap, but that it can be read directly on the potential 
(see below the remarks on the constant d*). 

A natural question arises: what happens when the gradient of the potential does not go to infinity, and 
when there is no spectral gap? Do we need to change the cooling schedule to reflect the slow-down of the 
diffusions at fixed temperature, or does the local structure of the potential dictate the optimal schedule? 

Before we answer this question, let us be more precise and give our hypotheses. 

We study the following optimization problem: how to find the minimum of a function $V$ on the space
$\mathbb{R}^d$. To solve this problem, we introduce the following stochastic differential equation:

\[ dX_t = \sqrt{\sigma(t)}\,dB_t - \frac{1}{2}\nabla V(X_t)\,dt, \qquad X_0 \sim m_0. \]
The function $\sigma$ will be called temperature, and will be a (deterministic) function of time, decreasing to zero.

Intuitively, this process is similar to simulated annealing: we perturb a gradient descent by a stochastic 
term whose intensity decreases over time. 
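
The dynamics described above can be sketched numerically with an Euler-Maruyama scheme. The snippet below is only an illustration under stated assumptions: it takes the equation in the form $dX_t = \sqrt{\sigma(t)}\,dB_t - \frac{1}{2}\nabla V(X_t)\,dt$ (the normalization consistent with the generator used in section 1.1), works in dimension one, and uses a hypothetical potential $V(x) = (1 + x^2)^{1/4} - 1$ together with the schedule $\sigma(t) = c/\ln(t)$. The names `potential_grad` and `simulate_annealing`, the step count and the time horizon are illustrative choices, not taken from the paper.

```python
import numpy as np


def potential_grad(x):
    # Gradient of the hypothetical potential V(x) = (1 + x^2)^(1/4) - 1.
    # It is bounded and tends to zero at infinity (slow growth of V).
    return 0.5 * x * (1.0 + x * x) ** (-0.75)


def simulate_annealing(x0, c, t0=2.0, t1=1.0e4, n_steps=100_000, seed=0):
    """Euler-Maruyama sketch of dX = sqrt(sigma(t)) dB - (1/2) V'(X) dt.

    sigma(t) = c / ln(t); we start at t0 > 1 so that ln(t) > 0.
    Returns the position at time t1.
    """
    rng = np.random.default_rng(seed)
    dt = (t1 - t0) / n_steps
    x = float(x0)
    t = t0
    for _ in range(n_steps):
        sigma = c / np.log(t)  # current temperature
        noise = np.sqrt(sigma * dt) * rng.standard_normal()
        x += -0.5 * potential_grad(x) * dt + noise
        t += dt
    return x
```

A call such as `simulate_annealing(3.0, c=1.0)` returns one sample of the process at the final time; repeating it with different seeds gives an empirical picture of the law $m_t$. A fixed-step scheme is a crude choice here, since the schedule only decreases logarithmically and very long horizons are needed to observe concentration.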

We would like to know if the process finds a point where the global minimum is reached; we will show 
that it does, in a weak sense. 

1 Definition. The annealing process starting from a law $m_0$ is said to converge if its law $m_t$ at time $t$
converges weakly to a measure supported by $\operatorname{argmin} V$. In particular, if the global minimum of $V$ is reached
in a single point $x_0$, the process converges if $m_t$ goes to a Dirac mass at $x_0$.

Let us now recall the result we would like to generalize: this is the main result of [HCS87, Roy89, Mic92],
as it appears in [Mic92].

2 Theorem (L. Miclo). If $V$ satisfies some regularity assumptions, and the following conditions:

• $V(x) \to \infty$ as $|x| \to \infty$,

• $|\nabla V(x)| \to \infty$ as $|x| \to \infty$,

• $|\nabla V| - \Delta V$ is bounded from below,

then there exists a constant $d^*$ such that, for any $c > d^*$, and for $\sigma(t) = c/\ln(t)$, the annealing process
converges.

To understand the direction in which we generalize this result, let us note that this theorem applies for
any potential $V$ which is equal to $|x|^\alpha$ outside a compact set, whenever $\alpha$ is strictly bigger than 1. It is then
a quite natural question to ask whether this still holds when $\alpha$ is strictly less than 1. Our hypotheses, which
we now state, allow us to treat this case.

Hypothesis 1 (Global minimum). The potential has a unique global minimum, located at the origin and
$V(0) = 0$. Moreover, this minimum is non degenerate: $\operatorname{Hess} V(0)$ is positive definite.

Hypothesis 2 (Growth at infinity). The potential $V$ goes to infinity at infinity faster than a logarithm:

\[ \exists m_V > 1,\ \exists C, \qquad V(x) \geq \ln(|x|)^{m_V} - C. \]

Hypothesis 3 (Bounded gradient). The potential $V$ is continuously differentiable, and its gradient is
bounded:

\[ \|\nabla V\|_\infty < \infty. \]

Hypothesis 4 (Concavity). The Laplacian of $V$ is negative at infinity: there exists a compact set $K$
such that

\[ \forall x \notin K, \qquad \Delta V(x) \leq 0. \]

One last hypothesis will be added in section 3 regarding the structure of local minima of $V$.
These hypotheses call for a few remarks. 

The first one simplifies the problem at hand: there is only one goal to go after. If the weak limit of the
equilibrium measures $\mu_\sigma$ (cf. infra) is known (some results in this direction may be found in [Mic92, Hwa80]),
the arguments given here should work in the same way. The non-degeneracy hypothesis may be weakened too
(see e.g. section 2 for a slight generalization in $d = 1$). However, this restriction allows for two simplifications:
it gives an estimate of the partition function $Z_\sigma$, and avoids more intricate reasonings in the computation
of the weak inequalities.

The growth hypothesis is not very restrictive. In particular, $V$ may grow like $|x|^\alpha$ with $\alpha < 1$ (or even
slower). These cases were not covered in the literature. Let us note that we do not know what happens in
the limit case (when $m_V = 1$, i.e. the tails of the equilibrium measures are polynomial).

In the light of the previously known results, the bounded gradient assumption seems less stringent: in 
some sense, we already know what happens when the gradient is big. The hypothesis could probably be 
lifted if we allowed a polynomial growth, or a control by V, but we keep it for the sake of clarity. 

Finally, the condition on $\Delta V$ seems more restrictive. It will only be used in the proof of the moment
bound (postponed to the annexes). It could probably be replaced by a condition like $\Delta V \leq C|\nabla V|^2$. However, in the
"natural example" where $V(x) = |x|^\alpha$ at infinity, the Laplacian is indeed negative if $\alpha < 1$, and this example
was one motivation for investigating the problem. Moreover, even this weakened hypothesis would not allow
the existence of traps at infinity, however shallow they may be. It would be interesting to know what could
happen if there were such traps: either they have no effect (in the sense that the same cooling schedule may
be chosen), or they slow down the process too much and destroy the convergence.

Our principal result is the following. 

3 Theorem. If the potential $V$ satisfies the hypotheses above, there exists a constant $d^*$ such that, if we
choose

\[ \sigma(t) = \frac{c}{\ln(t)} \]

with $c > d^*$, the annealing process converges.

This result generalizes theorem 2 by allowing more general choices for the potential function. In particular,
as we will see in the sequel, the equilibrium measures need not satisfy a Poincaré inequality. Nonetheless,
the critical cooling schedule is the same, which contradicts the intuition that the speed was given by the
Poincaré constants. In fact, what seems to prevail is the behavior of $V$ in a compact set, and from a certain
point of view, that is precisely what the weak inequalities capture.

The remainder of the paper is organized in the following way. Firstly, we explain the analytic approach 
of L. Miclo and give the main line of the proof. 

This proof, under our weakened hypotheses, uses weak Poincaré inequalities. We will need controls over
their dependence on temperature: these are established in sections 2 and 3, respectively in the one- and
multi-dimensional case. These three sections are the core of the proof of the convergence result.

The quite technical 4th section gathers definitions and results about Orlicz norms and weak inequalities.
Finally, we postpone to the annexes a comparison between functions centered by their mean or by their 
median, a moment bound for the annealing process, and a brief proof of the estimation of the partition 
function. 



1 The convergence of the process (main line of the proof) 
1.1 A differential inequality for the free energy 

Before we describe the main idea, we introduce some notation. Consider the SDE defining the annealing
diffusion, but with a constant temperature $\sigma$. The process is then a classical diffusion with a gradient drift.
The corresponding generator is given by:

\[ L_\sigma : f \mapsto \frac{\sigma}{2}\Delta f - \frac{1}{2}\nabla V \cdot \nabla f. \]

The measure $\mu_\sigma$ defined by

\[ d\mu_\sigma = \frac{1}{Z_\sigma}\exp\left(-\frac{V}{\sigma}\right) d\lambda \]

is reversible for this process ($Z_\sigma$ is a normalization constant). We will call $\mu_\sigma$ the instantaneous equilibrium
measure.

It is easy to see that, as $\sigma$ goes to zero, the measures $\mu_\sigma$ concentrate around the global minimum of the
potential (which is found at the origin by hypothesis). In fact, we even have the following convergence.

4 Proposition. The measures $\mu_\sigma$ converge weakly:

\[ \mu_\sigma \xrightarrow[\sigma \to 0]{} \delta_0. \]

Moreover, the normalization constant $Z_\sigma$ behaves like $\sigma^{d/2}$.
The asymptotic behavior of $Z_\sigma$ is proved in the annexes.
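
The scaling of the normalization constant can be checked numerically in dimension one. The sketch below is an illustration only: it assumes the hypothetical potential $V(x) = (1 + x^2)^{1/4} - 1$, which has a non degenerate minimum at the origin with $V''(0) = 1/2$, so that Laplace's method predicts $Z_\sigma \approx \sqrt{2\pi\sigma / V''(0)} = \sqrt{4\pi\sigma}$. The function names, the integration window and the grid size are arbitrary choices made for this example.

```python
import numpy as np


def potential(x):
    # Hypothetical potential: minimum 0 at the origin, grows like |x|^(1/2).
    return (1.0 + x * x) ** 0.25 - 1.0


def partition_function(sigma, half_width=20.0, n=400_001):
    # Riemann-sum approximation of Z_sigma = integral of exp(-V / sigma).
    # The window [-half_width, half_width] is adequate only for small sigma,
    # where the integrand is negligible outside it.
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    return float(np.sum(np.exp(-potential(x) / sigma)) * dx)


def laplace_ratio(sigma):
    # Ratio of Z_sigma to the Laplace prediction sqrt(4 * pi * sigma);
    # it should approach 1 as sigma goes to zero.
    return partition_function(sigma) / np.sqrt(4.0 * np.pi * sigma)
```

Evaluating `laplace_ratio` for decreasing values of `sigma` gives ratios that move closer to 1, in line with the $\sigma^{d/2}$ behavior for $d = 1$.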

In order to prove the convergence of the process, we follow the approach of L. Miclo ([Mic92]) and show
that the relative entropy of the law of the process with respect to its instantaneous equilibrium measure goes
to zero.

More precisely, let $f_t$ be the density of $m_t = \mathcal{L}(X_t)$ with respect to the equilibrium measure $\mu_t = \mu_{\sigma(t)}$. The
relative entropy (also called free energy) is $I_t = \int f_t \log f_t \, d\mu_t$, which can be rewritten as $I_t = \operatorname{Ent}_{\mu_t}\big((\sqrt{f_t})^2\big)$.
The finiteness of $I_t$ is established in the annexes. We would like to study the evolution of $I_t$; the natural idea
is to differentiate it. One can justify the following formal computation:
5 Proposition (Differentiation of the free energy). The derivative of the free energy is given by:

\[ \frac{dI_t}{dt} = \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int V\,(f_t - 1)\, d\mu_t - 2\sigma(t)\, \mathcal{E}_{\mu_t}\big(\sqrt{f_t}\big). \]

Remark. By $\mathcal{E}_{\mu_t}(f)$ we denote $\int |\nabla f|^2 \, d\mu_t$. This is somewhat improper: strictly speaking, this is the
energy associated with the generator $(1/2)\Delta - (1/(2\sigma(t)))\nabla V \cdot \nabla$ (so we should multiply our energy by $\sigma$ to get
the "real" one). However, the classical criteria for functional inequalities are written for this form of the
energy.

The first term is set aside for the time being; we shall bound it later directly by a function of $t$.

Following the classical path leading from functional inequalities to semigroup estimates, we now try to 
control the energy term on the right hand side. 

If the measures $\mu_t$ satisfied logarithmic Sobolev inequalities, everything would be fine: the energy of $\sqrt{f_t}$
could be controlled by its entropy with respect to $\mu_t$, and we would get $I_t$ back on the right hand side of the
inequality. We would still have to know how the constants in the logarithmic Sobolev inequality depend on
the small parameter $\sigma$, and get an upper bound for the first term, but we could get the convergence of $I_t$ to
zero.

Unfortunately, the scaling behavior of the constants in the logarithmic Sobolev inequality (i.e. the way
they behave when $\sigma$ goes to zero) is not clear. Moreover, this inequality need not hold, and in fact it won't
under our hypotheses.

In Miclo's paper, the first difficulty is overcome thanks to a Poincaré inequality, weaker than the
logarithmic Sobolev inequality, but for which the constants are well known. However, even this inequality won't
be satisfied in our case, and we have to find another way.

Our idea is to consider a still weaker functional inequality, namely a weak Poincaré inequality, written
with an Orlicz norm. Weak Poincaré inequalities were introduced by M. Röckner and F.-Y. Wang in [RW01],
originally with an $L^\infty$ norm and the mean of $f$ instead of a median on the right hand side. We will give a
brief account on weak inequalities and Orlicz norms in section 4 and explain the link between the original
inequality and the one we use.

For now, let us just state this inequality. It reads:

\[ \forall f,\ \forall r, \qquad \operatorname{Var}_{\mu_t}(f) \leq \alpha_t(r)\,\mathcal{E}_{\mu_t}(f) + r\,\|f - m_f\|_{\phi}^2, \tag{1} \]

where $m_f$ is a median of $f$ under $\mu_t$, and $\alpha_t$, a decreasing function of $r$, is the compensating function. The
Orlicz norm $\|\cdot\|_\phi$ is not easily tractable, but we will see (cf. lemma 30) that it can be bounded by the entropy:
there exists a $C$ such that, for all positive $f$,

\[ \|f - m_f\|_\phi^2 \leq C\left(\mu(f^2) + \operatorname{Ent}_\mu(f^2)\right). \]

At this point, the energy is controlled by three terms: $\mu(f^2)$, the entropy of $f^2$ and the variance of $f$. To get
rid of the variance term, we would like to bound it by entropy-like quantities. To this end we introduce the
following definition.

6 Definition. For any probability measure $\mu$ and any positive $f$, we will call pseudo-entropy the quantity:

\[ \operatorname{Ps-Ent}_\mu(f) = \int f \log^2(e + f)\, d\mu. \]



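With this definition in hand, we can state ([Mic92], lemma 4):
7 Lemma. There exists a $\delta_0$ such that, for all probability measures $\mu$ and all positive $f$ with $\mu(f^2) = 1$,

\[ \forall \delta \leq \delta_0, \qquad \frac{1}{\delta}\operatorname{Var}_\mu(f) + 4\delta \operatorname{Ps-Ent}_\mu(f^2) \geq \operatorname{Ent}_\mu(f^2). \]

Let us put all these inequalities together: we get that for all probability measures $\mu$, if $\mu$ satisfies the weak
Poincaré inequality then for all positive $f$ with $\int f^2 \, d\mu = 1$,

\[ \delta \operatorname{Ent}_\mu(f^2) - 4\delta^2 \operatorname{Ps-Ent}_\mu(f^2) \leq \operatorname{Var}_\mu(f) \leq \alpha(r)\,\mathcal{E}_\mu(f) + Cr\operatorname{Ent}_\mu(f^2) + Cr. \]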
This entails a lower bound on the energy:

\[ \mathcal{E}_\mu(f) \geq -\frac{4\delta^2}{\alpha(r)}\operatorname{Ps-Ent}_\mu(f^2) - C\frac{r}{\alpha(r)} + \frac{\delta - Cr}{\alpha(r)}\operatorname{Ent}_\mu(f^2). \]

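Let us get back to our special case, and take $\mu = \mu_t$, $f = \sqrt{f_t}$. The entropy $\operatorname{Ent}_\mu(f^2)$ just becomes $I_t$, and
we can plug the inequality back in the differential equation for $I_t$:

\[ \frac{dI_t}{dt} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int V\,(f_t - 1)\, d\mu_t + 8\delta^2 \frac{\sigma(t)}{\alpha_t(r)} \operatorname{Ps-Ent}_{\mu_t}(f_t) + 2C\sigma(t)\frac{r}{\alpha_t(r)} - 2(\delta - Cr)\frac{\sigma(t)}{\alpha_t(r)}\, I_t. \]

Since $\sigma$ is non-increasing in time and $V$ is non-negative, we may drop the $-1$ in $(f_t - 1)$ in the first term, and since $f_t\,d\mu_t = dm_t$,

\[ \frac{dI_t}{dt} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int V\, dm_t + 8\delta^2 \frac{\sigma(t)}{\alpha_t(r)} \operatorname{Ps-Ent}_{\mu_t}(f_t) + 2C\sigma(t)\frac{r}{\alpha_t(r)} - 2(\delta - Cr)\frac{\sigma(t)}{\alpha_t(r)}\, I_t. \tag{2} \]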

Our goal is to obtain a differential inequality involving only $I_t$ and explicit functions of $t$, so that we may
deduce information on the evolution of $I_t$. Since $\sigma$ is known, this leaves us with three questions. First, we
have to obtain controls on $\int V\,dm_t$ and on the pseudo-entropy: we will get explicit bounds in $t$. Once this
is done, we have to estimate the compensating function $\alpha_t$. Finally we must choose $r$ and $\delta$ depending on $t$
in a suitable way, so that the inequality on $I_t$ is good enough to prove the convergence to zero.
We now deal with the first problem. 



1.2 Moment bounds and pseudo-entropy 

The first inequality is a moment bound on the value of the potential at time t. The proof is postponed to 
the annexes. 

8 Lemma. Suppose that the hypotheses on $V$ stated above hold, and that the initial law $m_0$ satisfies $\int V^p \, dm_0 < \infty$.
Then there exists an $M$ such that:

\[ \int V^p(x)\, m_t(dx) \leq M\,\sigma(t)^p \ln(t)^p (\ln\ln t)^{3p}. \]

The last result will be used directly, but it also helps us prove the following bound. 

9 Lemma. Suppose that $\int V^2 \, dm_0$ is finite, and that the cooling schedule has the form $\sigma(t) = c/\ln(t)$, for
a positive constant $c$. Then there exists an $A$ such that, for all big enough $t$,

\[ \operatorname{Ps-Ent}_{\mu_t}(f_t) \leq A \ln(t)^2 (\ln\ln(t))^6. \]

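Proof. We differentiate the quantity under scrutiny, namely $J_t = \operatorname{Ps-Ent}_{\mu_t}(f_t)$. The following formal
computation can be justified (cf. [Mic92]):

\[ \frac{dJ_t}{dt} = -\frac{\sigma(t)}{2} \int F'(f_t)\,|\nabla f_t|^2 \, d\mu_t + \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int \frac{2 f_t}{e + f_t}\,\log(e + f_t) \left(V - \int V \, d\mu_t\right) dm_t, \]

where $F(x) = \frac{2x}{x + e}\log(x + e) + \log^2(x + e)$. Since $F$ is non decreasing (in $x$), and $\sigma$ is positive, the first term
is bounded above by 0. Moreover, since $V$ is positive and $1/\sigma$ increases, we may also forget the $\int V \, d\mu_t$
in the second term. Since $2f_t/(e + f_t) \leq 2$, we get:

\[ \frac{dJ_t}{dt} \leq 2\,\frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int \log(e + f_t)\, V \, dm_t. \]

By the Cauchy-Schwarz inequality, the integral on the right is at most $\sqrt{J_t}\,\big(\int V^2\,dm_t\big)^{1/2}$. After dividing by $2\sqrt{J_t}$, the left hand side becomes the derivative of $\sqrt{J_t}$. The right hand side may then be
bounded (cf. previous lemma):

\[ \frac{d}{dt}\sqrt{J_t} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \Big(\int V^2\,dm_t\Big)^{1/2} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \sqrt{M}\,\sigma(t)\ln(t)\,(\ln\ln(t))^3. \]

The explicit value of $\sigma$ allows us to simplify:

\[ \frac{d}{dt}\sqrt{J_t} \leq \sqrt{M}\,\frac{(\ln\ln(t))^3}{t}. \]

An easy computation shows that the right hand side may be bounded by:

\[ \sqrt{M}\,\frac{d}{dt}\Big(\ln(t)\,(\ln\ln(t))^3\Big). \]

To conclude the proof, we integrate this inequality between a (fixed and big enough) $t_0$ and the current
time $t$. The constant $A$ naturally depends on the initial law $m_0$ (through the value of $M$ and through the
pseudo-entropy at time $t_0$). □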

1.3 From the differential inequality to the convergence of the entropy 

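It is now time to get back to our differential inequality and apply the bounds we just derived. We fix a
logarithmic cooling schedule:

\[ \sigma(t) = \frac{c}{\ln(t)}. \]

Recall that we showed (inequality (2)):

\[ \frac{dI_t}{dt} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) \int V\, dm_t + 8\delta^2 \frac{\sigma(t)}{\alpha_t(r)} \operatorname{Ps-Ent}_{\mu_t}(f_t) + 2C\sigma(t)\frac{r}{\alpha_t(r)} - 2(\delta - Cr)\frac{\sigma(t)}{\alpha_t(r)}\, I_t. \]

We use the moment bound (lemma 8) to deal with the first term, and lemma 9 to bound the second one.

\[ \frac{dI_t}{dt} \leq \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) M (\ln\ln(t))^3 + 8M\delta^2 \frac{\sigma(t)}{\alpha_t(r)} (\ln(t))^2 (\ln\ln(t))^6 + 2C\sigma(t)\frac{r}{\alpha_t(r)} - 2(\delta - Cr)\frac{\sigma(t)}{\alpha_t(r)}\, I_t. \]

We number our four terms and define:

① $= \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) M (\ln\ln(t))^3$,    ③ $= 2C\sigma(t)\frac{r}{\alpha_t(r)}$,

② $= 8M\delta^2 \frac{\sigma(t)}{\alpha_t(r)} (\ln(t))^2 (\ln\ln(t))^6$,    ④ $= 2(\delta - Cr)\frac{\sigma(t)}{\alpha_t(r)}$.

The inequality becomes:

\[ \frac{dI_t}{dt} \leq ① + ② + ③ - ④\, I_t. \tag{3} \]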

This last inequality will allow us to prove that the free energy goes to zero. To this end, we use the same 
lemma as L. Miclo: 

10 Lemma. Let $I$ be a positive function, and suppose:

\[ \frac{dI_t}{dt} \leq a(t) - b(t) I(t), \]

where $a$, $b$ are positive functions and satisfy:
1. $\int^\infty b(t)\,dt = \infty$,
2. $a(t)/b(t) \to 0$ as $t \to \infty$.

Then $I$ goes to zero when $t$ goes to infinity.
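
The mechanism behind this lemma can be illustrated on the extremal case where the differential inequality is an equality, $I' = a - bI$. The sketch below is a numerical illustration under assumptions of our own: the example functions $a(t) = 1/(t\ln t)$ and $b(t) = 1/t$ (so that $\int^\infty b = \infty$ and $a/b = 1/\ln t \to 0$), a log-spaced time grid, and an exponential integrator that freezes $a$ and $b$ on each step. None of these choices come from the paper.

```python
import math

import numpy as np


def a_example(t):
    # Hypothetical forcing term: a(t) = 1 / (t ln t).
    return 1.0 / (t * math.log(t))


def b_example(t):
    # Hypothetical damping rate: b(t) = 1 / t, whose integral diverges.
    return 1.0 / t


def integrate_bound(a, b, t0, t1, i0, n=20_000):
    """Integrate I' = a(t) - b(t) I on [t0, t1] starting from I(t0) = i0.

    On each step of a log-spaced grid, a and b are frozen at the left
    endpoint and the resulting linear ODE is solved exactly.
    """
    ts = np.exp(np.linspace(math.log(t0), math.log(t1), n + 1))
    i = i0
    for t, t_next in zip(ts[:-1], ts[1:]):
        decay = math.exp(-b(t) * (t_next - t))
        i = i * decay + (a(t) / b(t)) * (1.0 - decay)
    return i
```

Calling `integrate_bound(a_example, b_example, math.e, math.exp(200.0), 1.0)` follows the solution over a very long time window; the value tracks the ratio $a/b$, which is why the hypothesis $a/b \to 0$ forces the solution to zero.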

Our goal is now to use the inequality (3) to check the hypotheses of this lemma. We choose $\delta$ and $r$ as
follows.

\[ \delta_t = \frac{1}{\ln(t)^2 (\ln\ln(t))^7}, \qquad r_t = \frac{1}{\ln(t)^2 (\ln\ln(t))^8}. \tag{4} \]

This ensures:

\[ \frac{③}{④} = \frac{C r_t}{\delta_t - C r_t} \sim \frac{C}{\ln\ln(t)} \to 0, \]

\[ \frac{②}{④} = \frac{4M\delta_t^2 \ln^2(t)(\ln\ln(t))^6}{\delta_t - C r_t} \sim 4M\delta_t \ln^2(t) (\ln\ln(t))^6 \to 0. \]

Two things remain to check:

\[ \frac{①}{④} \to 0 \quad \text{and} \quad \int^\infty ④\, dt = \infty. \]

This is where we need bounds on the weak Poincaré inequalities: we have to know how $\alpha_t$ behaves for our
particular choice of $r$. This is the aim of the following sections, in one or many dimensions.
In both cases, we will get:

11 Lemma. There exists a constant $d^*$ such that, for all $D^* > d^*$,

\[ \exists C_\alpha, \qquad \alpha_t(r_t) \leq C_\alpha \exp\left(\frac{D^*}{\sigma(t)}\right). \]

For the cooling schedule $\sigma(t) = c/\ln(t)$, we get:

\[ \alpha_t(r_t) \leq C_\alpha\, t^{D^*/c}. \]

In the one-dimensional case, this follows from theorem 12 below, and the choice of $r_t$. The
multi-dimensional case is proved in theorem 17 and the discussion that follows it.

Remark. The approach in the one- and multi-dimensional case will differ slightly. In the former, we prove
a (full) weak Poincaré inequality, i.e. we estimate the whole function $\alpha_t$, and then use this estimate at the
point $r_t$. In the latter, we will only prove a bound on $\alpha_t$ at $r_t$ and disregard the other points.



We may now get back to our proof. Recall that we have assumed:

\[ \sigma(t) = \frac{c}{\ln(t)}, \qquad c > d^*, \]

so that we may always pick a $D^*$ strictly less than $c$.

Let us check the two remaining points. First we must prove that ①/④ converges to zero. Since $\sigma$ is explicit and
we know a bound on $\alpha_t(r_t)$, we see that:

\[ \frac{①}{④} = \frac{d}{dt}\left(\frac{1}{\sigma(t)}\right) M(\ln\ln(t))^3 \times \frac{\alpha_t(r_t)}{2(\delta_t - Cr_t)\sigma(t)} \leq M'\,\frac{1}{t}\,\ln(t)^3 (\ln\ln(t))^{10}\, \alpha_t(r_t), \]

where $M$, $M'$ are constants.

Using the bound on $\alpha$ we just recalled (lemma 11), we get:

\[ \frac{①}{④} \leq M''\,\frac{t^{D^*/c}}{t}\,\big((\ln t)^3 (\ln\ln t)^{10}\big). \]

Since $c > D^*$, ①/④ goes to zero, as was claimed.
Just in the same way, we have, for $t$ big enough:

\[ ④ \geq M''\,(\ln t)^{-3}(\ln\ln t)^{-7}\, t^{-D^*/c}. \]

Once more, the condition $c > D^*$ guarantees that the integral of this quantity diverges, which was expected.
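
The role of the condition $c > D^*$ in these two checks can be made concrete by working in the variable $s = \ln t$, which avoids numerical overflow. The sketch below is only an illustration: it evaluates the logarithm of $t^{D^*/c - 1}(\ln t)^3(\ln\ln t)^{10}$ (the quantity that must go to zero) and the logarithm of $t \cdot (\ln t)^{-3}(\ln\ln t)^{-7}\, t^{-D^*/c}$ (the integrand of the divergent integral after the substitution $t = e^s$). The function names and the sample values of $c$ and $D^*$ are hypothetical.

```python
import math


def log_ratio_bound(s, c, d_upper):
    # log of t^(D*/c - 1) (ln t)^3 (ln ln t)^10, written with s = ln t.
    return (d_upper / c - 1.0) * s + 3.0 * math.log(s) + 10.0 * math.log(math.log(s))


def log_rate_in_log_time(s, c, d_upper):
    # log of the integrand in the variable s = ln t (dt = e^s ds):
    # e^s * s^(-3) * (ln s)^(-7) * e^(-(D*/c) s).
    return (1.0 - d_upper / c) * s - 3.0 * math.log(s) - 7.0 * math.log(math.log(s))
```

With `c = 2.0` and `d_upper = 1.0`, the first quantity becomes very negative for large `s` while the second grows without bound; with `c` smaller than `d_upper` both behaviors are reversed, which is the numerical counterpart of the threshold on the cooling schedule.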

This allows us to apply lemma 10 and prove that $I_t$ converges to 0. Thanks to Pinsker's inequality, the
total variation distance between $m_t$ (law of the process) and $\mu_t$ (the instantaneous equilibrium) converges to 0 too. Since
we already know that $\mu_t$ converges weakly to the Dirac mass $\delta_0$, this concludes the proof.

1.4 Some remarks 

Our theorem immediately raises a few questions. Some of these have already been asked when we discussed 
the hypotheses — equilibrium measures with polynomial tails are not covered, and we do not know what 
happens when there are traps at infinity. 

It would also be interesting to know what happens if we cool faster than the "good" schedule. A priori, 
the process has no reason to converge to the global minimum; intuitively it should freeze in some local 
trap. One could ask if this trap is a good approximation of the global aim. Answering this question seems
impossible in full generality: one would have to assume much more on the potential function, and on the
starting point. The "analytic" approach may not be the best suited for this task.



2 The one-dimensional case 

In this section we treat the case of a one-dimensional potential, for which we derive a weak Poincaré inequality
(more precisely we prove lemma 11).

The major advantage of this case is that, in one dimension, explicit (Hardy-like) criteria are known for
weak inequalities. Thus we are able to prove a quite general result (the de-coupling of the parameters $s$ and
$\sigma$ in the weak inequality). This has a small price: we restrict ourselves to potentials that grow like a power
of $x$, and do not cover the case $V(x) = \log(x)^\alpha$ at infinity (for some $\alpha > 1$). It should be noted that the
multidimensional argument (cf. next section) may still be used in this logarithmic case.

Let us write down a few notations. The potential $V$ is a real function, continuously differentiable. For
any (small) $\sigma$, we denote by $V_\sigma$ the function $\frac{1}{\sigma}V$, and by $Z_\sigma = \int e^{-V_\sigma(x)}\,dx$ the partition function. We
normalize $V_\sigma$ by defining $\Phi_\sigma$: $\Phi_\sigma = V_\sigma + \log Z_\sigma$. The equilibrium measure $\mu_\sigma$ reads:

\[ d\mu_\sigma = \frac{1}{Z_\sigma}\exp(-V_\sigma)\,d\lambda = \exp(-\Phi_\sigma)\,d\lambda. \]

We now state our hypotheses on $V$. We suppose there exists a compact set $[K_1, K_2]$ such that the
following holds.

Hypothesis U 1 (Behavior near the minimum). In $[K_1, K_2]$, $V$ is bounded below by 0 and above by
$V(K_1) = V(K_2)$. It reaches its minimum only once, at $x_1$. Near this point, $V$ behaves like:

\[ V(x) \sim (x - x_1)^b, \]

with $b > 1$. Finally, there exists $\delta$ such that $V$ is bijective from $[x_1, x_1 + \delta]$ onto its image, and from $[x_1 - \delta, x_1]$
onto its image.

This generalizes a little the general assumptions on the minimum: if $\operatorname{Hess} V$ is positive definite at $x_1$, $V$
satisfies this hypothesis with $b = 2$.

Hypothesis U 2 (Behavior outside the compact). Outside the compact, $V'$ and $|V''|/(V')^2$ are bounded:

\[ \exists C_V,\ \forall x \notin [K_1, K_2], \qquad \frac{|V''(x)|}{V'(x)^2} \leq C_V. \tag{5} \]

In particular, $V'$ has no zero, $V$ decreases before $K_1$ and increases after $K_2$.
Hypothesis U 3 (The function $\beta$). There exists a function $\beta$ such that, for all $x$ outside the compact,

13 { v(x) )-vW () 

To apply the result to the annealing diffusion, we need an additional growth condition on $\beta$:

Hypothesis U 4 (Behavior of $\beta$ near the origin). There exist constants $A$, $C$ such that, near 0, the
following holds:

\[ \beta(s) \leq C\,|\log(s)|^A. \]

Remark. We shall note here that the last two hypotheses hold if $V(x) = x^\alpha$ outside a compact, with $\alpha \in (0, 1]$,
if we choose $\beta$ proportional to a suitable power of $\log(1/s)$ (cf. [RW01, BCR05]). If $V$ grows like a logarithm to some power, this is
not true ($\beta$ behaves like a power of $s$). This explains the small loss of generality we spoke about above.

We define, for all $x > x_1$, $i(x) = \inf\{V(y),\ y \geq x\}$ and $s(x) = \sup\{V(y),\ y \in [x_1, x]\}$. In the same way,
$i(x) = \inf\{V(y),\ y \leq x\}$ and $s(x) = \sup\{V(y),\ y \in [x, x_1]\}$ for $x$ less than $x_1$.

Outside $[K_1, K_2]$, we have $i = V = s$, so $s - i$ is continuous with compact support. We call $d^*$ its
maximum value.
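
The definition of $d^*$ in dimension one translates directly into running maxima and minima on a grid. The sketch below is a discretized illustration, not part of the paper: the function name `barrier_height_1d` and the piecewise-linear example potential are hypothetical, and the potential is assumed to be sampled on an ordered grid containing its global minimum.

```python
import numpy as np


def barrier_height_1d(v):
    """Approximate d* = max(s - i) for a potential sampled on an ordered grid.

    On each side of the global minimum x_1, s is the running maximum of V
    walking away from x_1, and i is the minimum of V over all points
    further away from x_1.
    """
    v = np.asarray(v, dtype=float)
    k = int(np.argmin(v))  # index of the global minimum x_1
    best = 0.0
    # Right side as-is; left side reversed so that index 0 is always x_1.
    for side in (v[k:], v[: k + 1][::-1]):
        s = np.maximum.accumulate(side)               # sup of V on [x_1, x]
        i = np.minimum.accumulate(side[::-1])[::-1]   # inf of V beyond x
        best = max(best, float(np.max(s - i)))
    return best


# Hypothetical example: global minimum at 0, a local well of depth 2 near x = 2.
grid = np.linspace(-4.0, 4.0, 801)
values = np.interp(grid, [-4.0, 0.0, 1.0, 2.0, 3.0, 4.0], [4.0, 0.0, 3.0, 1.0, 5.0, 6.0])
d_star = barrier_height_1d(values)
```

In this example the local minimum sits at height 1 and the barrier separating it from the origin at height 3, so the computed value is the depth of that well; a monotone potential on both sides of the minimum gives zero.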

The main result of this section may now be stated as follows. 

12 Theorem. The measure $\mu_\sigma$ satisfies a weak Poincaré inequality with the $L^\infty$ norm, with a compensation
function $\beta_\sigma$ defined by:

f3 a ( s )^Cexp(^j(3( s ), 

where $\beta$ is given by the hypothesis. Similarly, $\mu_\sigma$ satisfies a weak inequality with an Orlicz norm and the
modified function $\alpha_\sigma$ given by:



a a (r)^C(3 a (Vexp(-^ 



fVxp ( (3 ^C'exp (~ 



Finally, there exists a constant A such that the following bound holds: 



a a (r) < C exp 



d*\ 1 



a 



..A ■ 



To prove this, we will use a result from Barthe, Cattiaux and Roberto ([BCR05], theorem 3), which gives
estimates on the compensating functions for the $L^\infty$ norm. We will then use capacity-measure criteria to
derive the result with the Orlicz norm. To state the result we need, we first give some additional notation.

Let $m_\sigma$ be a median of $\mu_\sigma$, and for all $x$,

a[X) ~ (3(Ce-^v)dy) 
\[ B_\sigma = \sup_{x > m_\sigma} B_\sigma(x). \tag{7} \]

By symmetry, we also define $b_\sigma(x)$ and $b_\sigma$ for $x < m_\sigma$.
The result from [BCR05] reads:



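13 Theorem. Let $\beta : (0, 1) \to \mathbb{R}_+$ be non increasing, and $B_\sigma$, $b_\sigma$ be defined by (7).
Then $\mu_\sigma$ satisfies the following weak Poincaré inequality:

\[ \operatorname{Var}_{\mu_\sigma}(f) \leq C_\sigma\,\beta(s) \int |\nabla f|^2 \, d\mu_\sigma + s\,\operatorname{Osc}(f)^2, \]

where $C_\sigma \leq 12\max(b_\sigma, B_\sigma)$.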

Note that their result is actually stronger, since it also gives a lower bound on the optimal constant $C_\sigma$ in
terms of some quantities very similar to $B_\sigma$.

To use this result, we have to bound $B_\sigma(x)$, and this has to be done uniformly in $x$. We will split $\mathbb{R}$ into
two domains, and show that, in some sense, our choice of $\beta$ already deals with $B_\sigma$ for large $x$, so that the
crucial region is near the minimum $x_1$.

What happens for large $x$. We study the case $x > K_2$ by following the proof of corollary 4 in [BCR05].
14 Lemma. For all $\sigma$, there exists a $c_\sigma$ such that:



Proof. Recall that the same bound holds for $V$ (cf. hypothesis U 3); we try to carry it over to $\Phi_\sigma$.

The behavior of $V$ near its minimum allows us to get an equivalent for $Z_\sigma$ using Laplace's method (cf.
for example [Die68]); if $V \sim (x - x_1)^b$, we get





(8) 



One may choose c a = 



where $C$ depends only on $V$. Let us bound the argument in the function $\beta$.




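15 Lemma. For all $x > K_2$, we have the following inequalities:

\[ \mu_\sigma([x, \infty)) \leq 2\,\frac{e^{-\Phi_\sigma(x)}}{\Phi'_\sigma(x)} \leq 3\,\mu_\sigma([x, \infty)). \]

Proof. For all $x > K_2$ and $\sigma$ small enough (less than $1/(2C_V)$), the hypothesis on $V$ gives us:

\[ \left|\frac{\Phi''_\sigma(x)}{\Phi'_\sigma(x)^2}\right| = \sigma\,\frac{|V''(x)|}{V'(x)^2} \leq \frac{1}{2}. \]

Therefore:

\[ \left(\frac{\exp(-\Phi_\sigma)}{\Phi'_\sigma}\right)' = -\left(1 + \frac{\Phi''_\sigma}{(\Phi'_\sigma)^2}\right)\exp(-\Phi_\sigma) \leq -\frac{1}{2}\exp(-\Phi_\sigma). \]

This gives the first result by integration. In a similar way,

\[ \left(\frac{\exp(-\Phi_\sigma)}{\Phi'_\sigma}\right)' \geq -\frac{3}{2}\exp(-\Phi_\sigma) \]

leads to the second claim. □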
We are now in a position to bound $B_\sigma(x)$.

gg(a)=M[g - TO))x ^(M[U))) x £ e * g(g) 

< ^ jki / \ x —7 r x / e*-Wdy + 2— — T (by lemma[EJ 

< 2 ^ - . x gW x / e*'^dj/ + 2-—- (bylemmalll 

< f 2 e v " {y) - v ' iK2) dy + — . (because V(a;) > V(ifa)) 



The hypotheses imply that V(y) > V(Kz), whenever \y\ < K 2 . On the other hand, &' a is bounded above by 
C/cr (since V is supposed to be bounded). Finally, 

C 

\/x > K 2 , B a {x) < , 



where C is independent of a. 

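What happens in the well. The general strategy here is to bound $B_\sigma(x)$ by studying only the numerator.
The denominator can be (very) roughly bounded by $\beta(1/2)$ (which does not depend on $\sigma$). The partition
function disappears, and we get:

\[ B_\sigma(x) \leq C \int_{m_\sigma}^{x} e^{V(y)/\sigma}\,dy \times \int_x^\infty e^{-V(y)/\sigma}\,dy. \]

We need a bound on $V$ near the median: under our hypotheses, since $\mu_\sigma$ converges weakly to $\delta_{x_1}$, the
continuity of $V$ in $x_1$ yields (for $\sigma$ small enough):

\[ \forall x \in [x_1, m_\sigma], \qquad V(x) \leq d^*/4. \]

Now we can bound the first integral in the following way:

\[ \int_{m_\sigma}^{x} e^{V(y)/\sigma}\,dy \leq (K_2 - K_1)\exp\left(\frac{1}{\sigma}\max\big(s(x), d^*/4\big)\right), \]

where $d^*/4$ takes care of the case when $m_\sigma$ is less than $x_1$.
We cut the second integral in two parts:

\[ \int_x^\infty e^{-V_\sigma(y)}\,dy \leq \int_x^{K_2} e^{-V_\sigma(y)}\,dy + \int_{K_2}^\infty e^{-V_\sigma(y)}\,dy. \]

Since $V$ is strictly increasing after $K_2$, we may apply Laplace's method to the second term. In the first one,
we use a rough bound on $V$:

\[ \int_x^\infty e^{-V_\sigma(y)}\,dy \leq (K_2 - K_1)\exp\left(-\frac{i(x)}{\sigma}\right) + C\exp\left(-\frac{V(K_2)}{\sigma}\right). \]

Since $i(x)$ is less than $V(K_2)$, the second term is less than the first one (up to a constant), and there exists
$C'$ such that:

\[ \int_x^\infty e^{-V_\sigma(y)}\,dy \leq C'\exp\left(-\frac{i(x)}{\sigma}\right). \]

Coming back to $B_\sigma$, we get:

\[ B_\sigma(x) \leq C''\exp\left(\frac{1}{\sigma}\Big(\max\big(s(x), d^*/4\big) - i(x)\Big)\right) \leq C'''\exp\left(\frac{d^*}{\sigma}\right). \]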
Conclusion: an upper bound on $\beta_\sigma$. Let us now gather the bounds on $B_\sigma(x)$ we derived in the preceding
paragraphs.

16 Lemma. There exists a $C$ (independent of $\sigma$) such that, for all $\sigma$,

\[ B_\sigma = \sup_{x > m_\sigma} B_\sigma(x) \leq C\exp\left(\frac{d^*}{\sigma}\right). \]

With this result in hand, we may apply Barthe, Cattiaux and Roberto's result (theorem 13): this proves
the first claim of theorem 12.

The modified function $\alpha_\sigma$ is deduced from $\beta_\sigma$ with the help of a theorem stated below, in section 4.

Finally, the growth hypothesis on $\beta$ (hypothesis U 4) guarantees that, near 0, $\beta$ is bounded by a power of $\ln(1/s)$;
this immediately implies the last result, and concludes the proof.

3 The weak inequality in any dimension 

We now turn to the proof of the weak inequality (the bound in lemma 11) in any dimension. We are going
to need one more hypothesis on the structure of potential wells, to avoid "pathological" cases.

After that, we proceed in several steps. First we recall our aim and explain the main lines of the proof.
During this proof, a certain "path" (in fact, an open set of $\mathbb{R}^d$) will appear. It will be used to derive a
"capacity-measure" inequality. Eventually, we will go from this inequality to the one we seek, using a result
from the next section.

3.1 The last hypothesis on the potential 

To write down the last hypothesis we shall make on $V$, we first need a few more notations.

For all $x \in \mathbb{R}^d$, we call $\Gamma_x$ the set of paths from $x$ to 0. For each such $\gamma$ ($\gamma$ is a continuous function from
$[0, 1]$ into $\mathbb{R}^d$), we call $h(\gamma)$ the "height" of $\gamma$, i.e. the highest value taken by $V$ along $\gamma$:

\[ h(\gamma) = \sup_{t \in [0,1]} V(\gamma(t)). \]

Now suppose we try to go from $x$ to 0 while remaining as low as possible (i.e. we try to find a path where
$V$ is small). There is a minimum price to pay; whatever path we choose, we will necessarily go at least as
high as:

\[ h(x) = \inf_{\gamma \in \Gamma_x} h(\gamma). \]

We will call "good paths" the ones that stay below that minimal height:

\[ \gamma \text{ is good} \iff h(\gamma) = h(x). \]

A priori, for a given $x$, a good path from $x$ to 0 need not exist: it may well be the case that, if one tries to
find $\gamma$ such that $h(\gamma) < h(x) + 1/n$, one has to go farther and farther as $n$ grows, and that no finite path
achieves the infimum bound.

Finally, the height of the "potential barrier" between $x$ and the global minimum will be called $d^*(x)$:

$$ d^*(x) = h(x) - V(x), $$

and the height of the biggest barrier will be just $d^*$:

$$ d^* = \sup_x d^*(x). $$

Hypothesis 5. The potential barriers have a bounded height:

$$ d^* < \infty. $$

Moreover, each point can reach $0$ by a "relatively short" good path. More precisely, there exists a function $R$
(a maximal radius), from $\mathbb{R}^d$ to $\mathbb{R}$, which satisfies the following conditions:

• For all $x$, the ball centered in zero and of radius $R(|x|)$ contains a good path for $x$:

$$ \exists \gamma \in \Gamma_x, \quad \gamma \subset B_{R(|x|)}, \quad \gamma \text{ is good.} $$

• The function R grows like a power of the distance to the origin: 

R(\x\)<c R \x\ d *. 
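
The quantities $h(x)$, $d^*(x)$ and $d^*$ can be made concrete in dimension one, where every path from $x$ to $0$ has to cross the whole segment between them, so that $h(x)$ is simply the maximum of $V$ over that segment. The following is a minimal sketch under that one-dimensional assumption; the double-well potential `V` and the helper `barrier_heights` are hypothetical names chosen for illustration and do not come from the paper.

```python
def V(x):
    # Hypothetical double-well potential: global minimum V(0) = 0,
    # local minimum V(3) = 1, the two wells meet at x = 5/3.
    return min(x * x, (x - 3.0) ** 2 + 1.0)


def barrier_heights(potential, xs):
    """Return the list of d*(x) = h(x) - V(x) on a sorted grid xs containing 0.0.

    One-dimensional assumption: h(x) is the maximum of the potential on the
    segment between 0 and x, computed here as a running maximum.
    """
    values = [potential(x) for x in xs]
    zero = xs.index(0.0)
    h = values[:]  # h(0) = V(0)
    running = values[zero]
    for i in range(zero + 1, len(xs)):  # sweep to the right of the origin
        running = max(running, values[i])
        h[i] = running
    running = values[zero]
    for i in range(zero - 1, -1, -1):  # sweep to the left of the origin
        running = max(running, values[i])
        h[i] = running
    return [hi - vi for hi, vi in zip(h, values)]


xs = [i / 100 for i in range(-300, 601)]
d = barrier_heights(V, xs)
d_star = max(d)  # grid approximation of d* = sup_x d*(x)
print(d_star)
```

For this toy potential the barrier top sits at $x = 5/3$ with $V = 25/9$, so the largest barrier is the one seen from the local minimum at $x = 3$, namely $25/9 - 1 = 16/9$; the grid value is slightly smaller because the grid does not contain $5/3$.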
3.2 The one-point weak inequality



As was said before, we will not prove in this section a full weak Poincaré inequality, i.e. we will not get
$\alpha_t(r)$ for all $r$. Instead, we just prove it for a specific value of $r$, namely $r = r_t = (\ln t)^{-2} (\ln \ln t)^{-8}$
(cf. the equation where $r_t$ was introduced). Since $\sigma(t) = c/\ln t$, we note that:

$$ r_t \ge C \frac{\sigma^2}{\ln(\sigma)^8} \ge C' \sigma^m, $$

for some $C$, $C'$ and $\sigma$ small enough. Therefore, and since $\alpha_t$ decreases, we may prove an inequality with $\sigma^m$
instead of $r_t$.

More precisely, we will get: 

17 Theorem. Let $m$ be a real number, strictly smaller than $1 + m_V$, and let $D^*$ be a constant, $D^* > d^*$.
Then there exists a $C_m$ such that, for all $\sigma$, the measure $\mu_\sigma$ satisfies the following one-point weak Poincaré
inequality:

$$ \forall f, \quad \mathrm{Var}_{\mu_\sigma}(f) \le C_m \exp\left(\frac{D^*}{\sigma}\right) \int |\nabla f|^2 \, d\mu_\sigma + \sigma^m \|f - m_f\|_\phi^2, $$

where $m_f$ is a median of $f$ under $\mu_\sigma$.
As was noted before, this entails

$$ \alpha_t(r_t) \le \alpha_t(\sigma^m) \le C_m \exp(D^*/\sigma), $$

which is the result of the lemma stated above.



The end of the section is devoted to the proof of the theorem. It can be sketched as follows. 

The idea is to use a capacity-measure criterion restricted to certain sets (large enough sets). Intuitively, if
a set $A$ has a large $\mu_\sigma$ mass, it must contain points near the origin; and these points are the important ones,
for measuring capacity as well as mass. For these sets, located near the origin, everything should behave as
in the compact case, and the inequality should depend on $\sigma$ in the same way as when a Poincaré inequality
holds.

Let us fix $D^*$, strictly bigger than $d^*$. As was just said, we would like to compare the capacity and
measure of large enough sets: let $\kappa > 0$ be the minimum mass we will consider ($\kappa$ will depend on $\sigma$). Let $A$
be a Borel set such that:

$$ \mu_\sigma(A) \ge 2\kappa(\sigma). $$

Restricting ourselves to these large sets localizes the problem in some sense. To be more precise, we introduce
two radii. The first one, $r_\sigma$, is such that:

$$ \mu_\sigma(B_{r_\sigma}) \ge 1 - \kappa. $$

The second one is deduced from it: it is a radius big enough to include good paths (cf. hypothesis 5) starting
from any point in the small ball $B_{r_\sigma}$:

$$ R_\sigma = R(r_\sigma). $$

These two quantities depend on $\sigma$ and $\kappa$; we will see that, for our choice of $\kappa$, $r_\sigma$ and $R_\sigma$ won't grow too
fast as $\sigma$ goes to zero.

Let $A = A' \cup A''$, where $A' = A \cap B_{r_\sigma}$, and $A''$ is the complement set. Since $\mu_\sigma(A) \ge 2\kappa$ and $\mu_\sigma(A'') \le \kappa$
(by definition of $r_\sigma$), $\mu_\sigma(A') \ge \kappa$, and:

$$ \mu_\sigma(A) = \mu_\sigma(A') + \mu_\sigma(A'') \le 2\mu_\sigma(A'). $$

Intuitively, we need only consider the subset A' , because it concentrates enough mass. 

At this point, our set $A'$ may still be very complicated. In particular, it could be scattered all over the
ball $B_{r_\sigma}$. To avoid this, we will once again restrict ourselves to a subset, trying to keep enough mass in the
process.

This is done by cutting $B_{r_\sigma}$ into small cubes. The bound on the gradient of $V$ (one of our hypotheses) helps us
choose a good mesh, such that $V$ does not vary too much inside a little cube.
18 Proposition. For all $\eta$, there exists $\varepsilon$ (depending only on $V$ and $\eta$), such that, on each cube $B$ with
radius $\varepsilon$,

$$ \sup_B V - \inf_B V \le \eta. $$

The parameter $\eta$ will be chosen later.

So we cut $B_{r_\sigma}$ into many little cubes of radius $\varepsilon$. This requires a certain number of cubes, which we call
$n_\sigma$. We then have:

$$ B_{r_\sigma} = B_1 \cup B_2 \cup \dots \cup B_{n_\sigma}. \qquad (9) $$

In the same way, $N_\sigma$ will be the number of cubes necessary to cover $B_{R_\sigma}$. We denote by $A_i$ the intersection
of $A$ and $B_i$. We apply the pigeonhole principle to say that one of the $A_i$'s must be large enough:

$$ \exists i_0, \quad \mu_\sigma(A_{i_0}) \ge \frac{1}{n_\sigma} \mu_\sigma(A'). $$

To sum up our considerations on sets, for each $A$, we have found a subset $A_{i_0}$ such that:

• $A_{i_0}$ is a subset of a cube of radius $\varepsilon$,

• $A_{i_0}$ is not too far from the origin ($A_{i_0} \subset B_{r_\sigma}$),

• $A_{i_0}$ is big enough compared to $A$: $\mu_\sigma(A_{i_0}) \ge \frac{1}{2 n_\sigma} \mu_\sigma(A)$.

In some sense, we need only consider the case when A looks like a ball and is not too far from the origin. 
We are going to see how this can be used to build a certain path between Ai and 0, and from this path, 
deduce a capacity-measure inequality. 



3.3 Building a path and straightening it out 

Recall that our goal is to compare the capacity and the measure of sets, and more precisely to bound the 
capacity from below and the measure from above. 
The capacity is defined by an infimum:

$$ \mathrm{Cap}_\mu(A) = \inf\left\{ \int |\nabla f|^2 \, d\mu, \quad 1_A \le f \le 1, \quad \mu(\mathrm{supp}\, f) \le \frac{1}{2} \right\}. \qquad (10) $$

Note that we only define capacities for sets whose measure is less than 1/2. This restriction explains why
we use functions recentered by their median when we deduce functional inequalities from capacity-measure
criteria.

Since we seek a bound from below, we consider a function satisfying the conditions, and we try to bound:

$$ \int |\nabla f|^2 \, d\mu. $$

The key idea is to find a region of $\mathbb{R}^d$ which should contribute a lot to this integral. Since the function
$f$ equals 1 near $A$, and 0 near $0$ (the measure of its support being less than 1/2), there must be a transition
between $A$ and $0$: this is where the gradient of $f$ appears. Still on the intuitive level, if the integral is to
be small, we had better make this transition in a region where $\mu$ has less mass, i.e. in a zone where $V$ is
large. This is the reason why we introduced the good paths: to go from $A$ to zero, a large contribution to
the energy should appear along these good paths.

To put these ideas on a firm ground, we will build, starting from $A$ (or more precisely from $A_{i_0}$), an open
set $C_A$ with good regularity properties, and then bound the capacity by integrals over this open set. This
construction is depicted in figure 2.

Once this set is built, we proceed in two steps. First, for every function $f$ satisfying the conditions of (10),

$$ \int |\nabla f|^2 \, d\mu_\sigma \ge \int |\nabla f|^2 1_{C_A} \, d\mu_\sigma. $$

On the path, we know by design that $V$ is bounded above by $V(x_A^*) + \eta$. Indeed, $V$ is less than $V(x_A^*)$
along $\gamma$, and the size $\varepsilon$ of the cubes has been chosen so that on each cube, the oscillation of $V$ is less than
$\eta$. Therefore, we may compare our integral with an integral with respect to the Lebesgue measure:

$$ \int |\nabla f|^2 \, d\mu_\sigma \ge \frac{1}{Z_\sigma} \exp\left( \frac{-V(x_A^*) - \eta}{\sigma} \right) \int |\nabla f|^2 1_{C_A} \, d\lambda. \qquad (11) $$
The next step is to bound the latter integral on $C_A$. Our only hypothesis is that $f$ must be 1 on $A$, and 0 near
zero. The idea is then to apply a Poincaré inequality to compare the energy to a variance. Unfortunately,
though we know that a Poincaré inequality holds under quite general assumptions for a bounded
domain in $\mathbb{R}^d$ (this is proved in many textbooks on partial differential equations, see e.g. [Eva98], p. 275-276),
the explicit constants and their behaviour when the domain changes are not well known. However, there is a
case for which we have such explicit estimates, namely the case of convex domains.

19 Theorem (Poincare inequality for convex domains). Let L be a convex bounded domain in R d . 
Then the Lebesgue measure on L satisfies a Poincare inequality, and the constant can be bounded above using 
only the diameter dh of the domain: 



This theorem is proved e.g. by Payne and Weinberger, and Bebendorf in [PW60], [Beb03]. Note that
other bounds in more complicated cases have been derived (see [CL97] for star-shaped domains, or [Che90]
for bounds depending on the geometry of the boundary).

Note that, by abuse of notation, we use Var for a non-normalized measure. 

In order to use this result, we try to "straighten out" the set $C_A$.

We will build a function $\phi$ sending $C_A$ to a tube $L_A$. This function will be defined piecewise, on each of
the little cubes that $C_A$ crosses. Let us denote these cubes as $C_0, \dots, C_m$. It is easy to see that the intersection
of $C_A$ and one of these cubes can only take a finite number of shapes (up to a rotation and/or translation).
In $d = 2$ for example, only two different shapes are possible (either a straight tube or a bent one, see
figure 2). Each of these shapes may be "straightened out" into a tube by a diffeomorphism. We have to be a
bit careful in choosing these diffeomorphisms $\phi_j$ (one for each shape). We will ask two things: they should
behave like a rigid motion in the neighborhood of the edges (so we may "glue" two transformations together),
and their Jacobian matrix should be sufficiently "nice" (the "niceness" needed will be made precise later).
Such a choice is possible; see the figure for an explanation of a possible way to find such good functions.

Once this is done, we only have to glue our pieces together. Let us denote the pieces $C_A \cap C_i$ by $T_i$. We
leave $T_0$ where it stands, and look at $T_1$. We have seen that it may be straightened into a tube, $T_1'$: define
$\phi$ on $T_1$ to be precisely this transformation. Now consider $T_2$: we can straighten it by one of our $\phi_j$, and
then use a rotation and/or a translation to put it next to $T_1'$. Since we have asked that the $\phi_j$ should be
rigid motions near the edges, the two pieces of $\phi$ define a diffeomorphism from $T_1 \cup T_2$ to the straight tube
$T_1' \cup T_2'$. We may iterate the process and eventually we get a diffeomorphism $\phi$ from $C_A$ to $L_A$. One can see
on the figure that a little extra care is needed to deal with the end of the path $C_A$; however, adding just
one $\phi_j$ to our set of transformations settles the question.

Remember that our goal is to use the Poincaré inequality on the convex set $L_A$. For this to work, we
need to control some quantities related to the map $\phi$.

20 Proposition. There exists a constant $C_\phi$, which may depend on $\varepsilon$ but not on $\sigma$, such that, at every
point, the Jacobian matrix $J_\phi$ satisfies

$$ |\det(J_\phi)| \le C_\phi, \qquad \lambda_1(J_\phi J_\phi^*) \ge C_\phi^{-1}, $$

where $\lambda_1(M)$ is the smallest eigenvalue of the symmetric matrix $M$.

Proof. This holds by design of the map $\phi$. At each point, $\phi$ is the composition of a rigid motion (which has
no effect on the eigenvalues or the determinant of the Jacobian matrix), and of one of the $\phi_j$. For a given
$\phi_j$, the properties hold: we have designed the $\phi_j$ as restrictions of diffeomorphisms on larger sets, so the
bounds hold by compactness. Since there is a finite number of $\phi_j$, we may choose bounds that do not depend
on $j$. This proves that the bounds hold for $\phi$. □

We may now give our "straightening" its rigorous form, namely a change of variables.

21 Proposition. Let $U$ and $V$ be open sets, let $\phi$ be a diffeomorphism from $U$ onto $V$. If the inequalities
in the preceding proposition hold with a constant $C_\phi$, then for every continuously differentiable function $f$ on $U$, we
have:

$$ \int_U |\nabla f|^2 \ge \frac{1}{C_\phi^2} \int_V |\nabla g|^2, $$

where $g = f \circ \phi^{-1}$.
Figure 2: Building the path $C_A$.

1. We consider a good path starting from the center of the cube $B_{i_0}$, and going to the origin. On this path, $V$ reaches its maximum at some $x_A^*$, and on the colored region, $V$ is bounded above by $V(x_A^*) + \eta$.

2. We pick a "path of cubes" from $B_0$ to $B_{i_0}$ which stays entirely within the colored region.

3. Within this path, we draw a smooth tube $C_A$. The intersection of $C_A$ and a given little cube may only take a finite number of shapes (up to a rigid motion); in this 2-dimensional drawing for example, we have either a straight tube ($T_1$) or a bent one ($T_2$). For technical reasons, we consider two more shapes at the end of the tube so that $B_{i_0}$ lies entirely within $C_A$.

Finally, the tube $C_A$ is sent onto $L_A$, a convex set for which we have an explicit Poincaré inequality.

This is how the bent tube on the left may be straightened. We consider a diffeomorphism which sends the regions between dotted lines on one another, and ask that it should be a rigid motion on the dark regions. Defining the transformation on a set (the region between dotted lines) larger than the tube (the region between plain lines) gives compactness bounds on the Jacobian.
Proof. It's a change of variables. Let us define $F$ by $F(x) = |\nabla f|^2(x)$. Then:

$$ \int_U |\nabla f|^2 \, dx = \int_U F(x) \, dx = \int_V F \circ \phi^{-1} \, |\det J_{\phi^{-1}}| \, dy \ge \frac{1}{C_\phi} \int_V F \circ \phi^{-1} \, dy. $$

Since $f = g \circ \phi$, the gradients are given by:

$$ (\nabla f)_x = (J_\phi)_x^* \, (\nabla g)_{\phi(x)}. $$

Taking norms, and using the lower bound on the first eigenvalue, we get:

$$ (|\nabla f|^2)_x = \nabla g^* \, J_\phi J_\phi^* \, \nabla g \ge C_\phi^{-1} (|\nabla g|^2)_{\phi(x)}. $$

Rewriting this in $y$ variables,

$$ F \circ \phi^{-1}(y) = (|\nabla f|^2)_{\phi^{-1}(y)} \ge C_\phi^{-1} (|\nabla g|^2)_y. $$

Finally,

$$ \int_U |\nabla f|^2 \ge \frac{1}{C_\phi^2} \int_V |\nabla g|^2. $$

□

Putting the last two propositions together, we can show:
22 Proposition. There exists a $C$, depending only on $\varepsilon$, such that if $f$ satisfies the following conditions:

1. $f$ is continuously differentiable from $C_A$ into $[0,1]$,

2. $\lambda(\{f = 0\}) \ge l_0$,

3. $\lambda(\{f = 1\}) \ge l_1$,
then

$$ \int |\nabla f|^2 \, d\lambda \ge \frac{C}{N_\sigma^2} \min(l_0, l_1). $$

We recall that $N_\sigma$ is the number of balls of radius $\varepsilon$ needed to cover the big ball $B_{R_\sigma}$.

Proof. Suppose $f$ satisfies the hypotheses. Define $g = f \circ \phi^{-1}$ as in the preceding proposition. The various
bounds needed on the Jacobian matrix of $\phi$ are provided by proposition 20. These bounds also imply that $g$
must vanish at least on a set of Lebesgue measure $C_\phi^{-1} l_0$, the same being true for the set where $g = 1$. The
change of variables has shown:

$$ \int |\nabla f|^2 \, d\lambda \ge C_\phi^{-2} \int |\nabla g|^2 \, d\lambda. $$

On the right hand side, we can now use the Poincaré inequality:

$$ \int |\nabla g|^2 \, d\lambda \ge \frac{1}{C_P(L_A)} \mathrm{Var}_\lambda(g). $$

The very purpose of our change of variables was to make the domain convex, so we could make use of theorem
19. The constant may therefore be bounded by the square of the diameter of $L_A$. Since $L_A$ results from
gluing together at most $N_\sigma$ little cubes of radius $\varepsilon$, the square of the diameter may be bounded by $\varepsilon^2 N_\sigma^2$.

We now turn to the variance, and use the information on the sets where $g$ is 0 or 1. We denote by $l_0'$, $l_1'$
the respective measures of these sets, and by $m$ the mean of $g$ ($m \in [0, 1]$). Then:

$$ \mathrm{Var}(g) = \int (g - m)^2 \, d\lambda \ge m^2 l_0' + (1 - m)^2 l_1'. $$

The right hand side is easily shown to be greater than $l_0' l_1' / (l_0' + l_1')$ (minimize over $m$). The latter is bounded below by half the
minimum of $l_0'$ and $l_1'$ (because the denominator is less than $2 \max(l_0', l_1')$). Since $l_0' \ge C_\phi^{-1} l_0$, and the similar
result holds for $l_1$,

$$ \int |\nabla f|^2 \, d\lambda \ge \frac{C}{N_\sigma^2} \min(l_0, l_1). $$

□
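
The elementary step on the variance can be checked numerically. The sketch below is only an illustration (the function names are hypothetical): it scans the possible values of the mean $m \in [0,1]$ and compares the two-level lower bound $m^2 l_0' + (1-m)^2 l_1'$ with $l_0' l_1'/(l_0' + l_1')$ and with half the minimum of $l_0'$ and $l_1'$.

```python
def two_level_bound(m, l0, l1):
    # Lower bound on Var(g) when g = 0 on a set of measure l0,
    # g = 1 on a set of measure l1, and g has mean m.
    return m * m * l0 + (1.0 - m) ** 2 * l1


def variance_chain(l0, l1, steps=1000):
    """Return (min over a grid of m, l0*l1/(l0+l1), min(l0, l1)/2)."""
    smallest = min(two_level_bound(k / steps, l0, l1) for k in range(steps + 1))
    harmonic = l0 * l1 / (l0 + l1)
    return smallest, harmonic, min(l0, l1) / 2.0


for l0, l1 in [(1.0, 1.0), (0.2, 3.0), (5.0, 0.01)]:
    smallest, harmonic, half_min = variance_chain(l0, l1)
    print(l0, l1, smallest >= harmonic - 1e-12, harmonic >= half_min)
```

The minimum over $m$ is attained at $m = l_1'/(l_0' + l_1')$, which is where the value $l_0' l_1'/(l_0' + l_1')$ comes from.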
We may now prove the measure-capacity inequality we are looking for. Indeed, recall that our aim is to
bound the capacity of a set $A$ from below by a function of its measure. The previous inequality is almost
what we want: on the left hand side is (up to a factor, see (11) above) the quantity whose infimum gives the
capacity (equation (10)), and on the right hand side $l_0$ and $l_1$ are measures of some sets. It remains to show
that these measures may be compared to the measure of $A$.



3.4 The measure-capacity inequality 

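Let us put together the results from the previous section (equation (11) and proposition 22):

$$ \int |\nabla f|^2 \, d\mu_\sigma \ge \frac{1}{Z_\sigma} \exp\left( \frac{-V(x_A^*) - \eta}{\sigma} \right) \int |\nabla f|^2 1_{C_A} \, d\lambda \ge \frac{C}{Z_\sigma N_\sigma^2} \exp\left( \frac{-V(x_A^*) - \eta}{\sigma} \right) \min(l_0, l_1), \qquad (12) $$

where $l_0$, $l_1$ are the Lebesgue measures of the following sets:

$$ l_0 = \lambda(\{f = 0\} \cap C_A), \qquad l_1 = \lambda(\{f = 1\} \cap C_A). $$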

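To bound $l_0$, we use the fact that $f$ vanishes on a sufficiently large set (as measured by $\mu_\sigma$). Since $\mu_\sigma$
concentrates around 0, $f$ should vanish near the origin. More precisely, for a fixed $\varepsilon$, we know that for $\sigma$
small enough, the cube centered in 0 and of radius $\varepsilon$ concentrates 3/4 of the measure. If this cube is labelled
$B_0$, we have:

$$ \mu_\sigma(\{f = 0\} \cap B_0) \ge \frac{1}{4}. $$

Since $V$ is non negative, $\mu_\sigma$ and $\lambda$ are easily compared:

$$ \frac{1}{4} \le \mu_\sigma(\{f = 0\} \cap B_0) = \frac{1}{Z_\sigma} \int 1_{f = 0} 1_{B_0} \exp\left( -\frac{V}{\sigma} \right) d\lambda. $$

The integral on the right hand side is less than $l_0$, therefore:

$$ l_0 \ge m_0 = \frac{Z_\sigma}{4}. $$

Let us derive a similar bound, $m_1$, for $l_1$. On the cube $B_{i_0}$, $V \ge V(x_A) - \eta$, so:

$$ \mu_\sigma(A_{i_0}) = \frac{1}{Z_\sigma} \int 1_{A_{i_0}} \exp\left( -\frac{V}{\sigma} \right) d\lambda \le \frac{1}{Z_\sigma} \int 1_{A_{i_0}} \exp\left( \frac{-V(x_A) + \eta}{\sigma} \right) d\lambda \le \frac{1}{Z_\sigma} \exp\left( \frac{-V(x_A) + \eta}{\sigma} \right) \lambda(A_{i_0}). $$

Therefore:

$$ l_1 \ge \lambda(A_{i_0}) \ge m_1 = Z_\sigma \exp\left( \frac{V(x_A) - \eta}{\sigma} \right) \mu_\sigma(A_{i_0}). \qquad (13) $$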

Since we would like to control min(Zo, h), we now have to compare the two bounds mo and mi. This is 
possible thanks to the following inequality: 

ti>a[Ai ) < — exp e . 



Z a \ a 

If we gather almost all terms on the left hand side, we recognize mi: 

mi < e d . 

Since mo = Z a /A, it holds that mo > Z <y m\t~ d , and since Z a goes to zero, it also holds that mi > Z a m\(T d , 
so that both lo and h may be bounded below by this quantity: 

min(Mi) > cxp (Y^A _ l) ^ Aio ). 
Going back to (12), we conclude:

J |V/| 2 d^ > ex P (~ V ( x *a) - V) min(Z , ii) 

^ ^ ex P (^(^) - V ( x a) - H ^(A ). 

By definition of $x_A^*$, $V(x_A) - V(x_A^*) - 2\eta \ge -d^* - 2\eta \ge -D^*$. On the other hand, $A_{i_0}$ was chosen precisely
because it contained enough of $A$'s mass: $\mu_\sigma(A_{i_0}) \ge (2 n_\sigma)^{-1} \mu_\sigma(A)$. Finally, every function $f$ we can choose
in the definition of capacity must satisfy:



where C' e = CC e . Taking the infimum over all possible / finally yields the following result. 

23 Proposition. Let $\kappa(\sigma)$ be a positive number, less than 1/2. Let $n_\sigma$, $N_\sigma$ be defined as in the discussion
near equation (9). Then the following bound holds:

VA,n a (A)>K(a) ^(^^expf^Cap^A). (14) 

3.5 Conclusion 

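The bigger part of the proof has now been done; the last thing we need to check is that the numbers of cubes
$n_\sigma$ and $N_\sigma$ do not grow too fast as $\sigma$ decreases. Then we will apply theorem 28 to deduce the one-point
inequality of theorem 17 from our measure-capacity inequality.

Recall that we are given a real number $m$, strictly smaller than $1 + m_V$. Define $\kappa(\sigma) = \exp(-1/\sigma^m)$. We
want to find an $r_\sigma$ such that the mass of $B_{r_\sigma}$ is greater than $1 - \kappa$. For any set $A$, we may write:

$$ \mu_\sigma(A) = \frac{1}{Z_\sigma} \int 1_A \exp\left( -\frac{V}{\sigma} \right) d\lambda = \frac{Z_{2\sigma}}{Z_\sigma} \times \frac{1}{Z_{2\sigma}} \int 1_A \exp\left( -\frac{V}{2\sigma} - \frac{V}{2\sigma} \right) d\lambda. $$

If $V$ takes large values on $A$, we can get a good bound:

$$ \mu_\sigma(A) \le \frac{Z_{2\sigma}}{Z_\sigma} \exp\left( -\frac{\inf_A V}{2\sigma} \right) \mu_{2\sigma}(A). $$

We get rid of the $\mu_{2\sigma}(A)$ by roughly bounding it by 1. Then we use the growth hypothesis on $V$, with
$A = B_r^c$ (the complement of the ball of radius $r$). In this case:

$$ \inf_A V \ge \ln(r)^{m_V}. $$

We fix an $m' \in \, ]m, 1 + m_V[$, and choose: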



2 \ irn — l)/mv x 



r a = exp | ( - 

which ensures: 



x \ (m'-l) 



inf V > 

A 



M#rJ < ^-exp ( -- 



Z fj \ 2a m ' 

The asymptotic behavior of Z a (cf. annex RTT)l implies that Zi a jZ a converges, and since m' > m, 

Ma (Br J < ex P ( - -r 
for $\sigma$ small enough. This shows that $r_\sigma$ satisfies the condition we wanted.

We may now end the proof of the theorem. Coming back to the measure-capacity inequality (14), we
note that $R_\sigma$, $n_\sigma$ and $N_\sigma$ all behave like $r_\sigma$ to a certain power (for $R_\sigma$ we use hypothesis 5, and $n_\sigma$, $N_\sigma$ are
just a number of cubes of fixed radius in the big cubes of side length $r_\sigma$ and $R_\sigma$). Therefore, there exists a
$C$ such that

VA,/x CT (A) > K (a) ^(A)<^exp^^Ca P ^(A). (15) 

The value of $r_\sigma$ and the fact that $m' - 1$ is strictly less than $m_V$ make $\exp(D^*/\sigma)$ the biggest term, so that,
up to a slight increase of $D^*$,

$$ \forall A, \quad \mu_\sigma(A) \ge \kappa(\sigma) \implies \mu_\sigma(A) \le \exp\left( \frac{D^*}{\sigma} \right) \mathrm{Cap}_{\mu_\sigma}(A). $$

This inequality, thanks to theorem 28 below, implies precisely the one-point weak Poincaré inequality we
claimed in theorem 17.

4 A measure-capacity criterion for one-point weak Poincaré inequalities

4.1 Definitions 

In this section we study the interplay between weak Poincare inequalities and measure-capacity inequalities. 
Let us start by recalling exactly what a weak Poincare inequality is. 

24 Definition (M. Röckner and F.Y. Wang, [RW01]). Let $\mu$ be a measure and $\mathcal{N}$ be a norm, stronger
than the $L^2(\mu)$ norm. The measure $\mu$ is said to satisfy a weak Poincaré inequality for the norm $\mathcal{N}$ if there
exists a decreasing positive function $\alpha$, defined on $(0, \infty)$, such that:

$$ \forall f \in L^2(\mu) \text{ such that } \mu(f) = 0, \quad \forall r > 0, \quad \mu(f^2) \le \alpha(r) \mathcal{E}(f, f) + r \mathcal{N}(f)^2. $$

If this holds, $\alpha$ will be called a compensating function.

Remark (on means and medians). The original statement on weak Poincaré inequalities involves functions
recentred by their mean value $\mu(f)$, and an $L^\infty$ norm. However, the approach by measure-capacity
inequalities developed in [BCR05, BCR] works with functions recentred by their median $m_f$. When the norm
is the sup norm, it is easy to go from one to the other: the three quantities $\mathrm{osc}(f)$, $\|f - m_f\|_\infty$ and $\|f - \mu(f)\|_\infty$
are within (universal) bounds of each other.

Since we need to work with another norm, we will show that we can still go from $\mathcal{N}(f - m_f)$ to $\mathcal{N}(f - \mu f)$
(cf. equation (19) in annex A).

This is equivalent to the slightly modified definition: 

25 Proposition. A weak Poincaré inequality holds if and only if:

$$ \forall r > 0, \ \exists c_r, \ \forall f \in L^2(\mu), \quad \mu(f) = 0 \implies \mu(f^2) \le c_r \mathcal{E}(f, f) + r \mathcal{N}(f)^2. \qquad (16) $$

If the inequality holds for a given couple $(r, c_r)$, we will say that $\mu$ satisfies a one-point weak Poincaré
inequality.

Therefore the weak Poincaré inequality holds if and only if a one-point inequality holds for each point $r$.

Proof. The only thing to check is that we can deduce the inequality of the definition from (16). To each $r$,
we associate $c_r$ according to (16). Then we just define $\alpha(r) = \inf\{c_s ; s \le r\}$. The function $\alpha$ is decreasing.
Now let $f$ be a function in $L^2$ and $r > 0$. For any $\varepsilon$, we may find an $s \le r$ such that:

$$ c_s \le \alpha(r) + \varepsilon. $$

If we apply (16) with this $s$, we get (since $s \le r$):

$$ \mu(f^2) \le c_s \mathcal{E}(f) + s \mathcal{N}(f)^2 \le \alpha(r) \mathcal{E}(f) + r \mathcal{N}(f)^2 + \varepsilon \mathcal{E}(f). $$

Since this is true for any $\varepsilon$, we may let it go to zero, and we have found a function $\alpha$. □
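
The construction of $\alpha$ in this proof is a running infimum. As a small illustration, assuming the constants $c_r$ are only known at finitely many points (the sample values below are made up, and `compensating_function` is a hypothetical helper name), it can be sketched as follows.

```python
def compensating_function(points):
    """Given pairs (r, c_r), return pairs (r, alpha(r)) with
    alpha(r) = min{c_s : s <= r}, the finite analogue of inf{c_s ; s <= r}."""
    result = []
    best = float("inf")
    for r, c in sorted(points):  # increasing r
        best = min(best, c)      # running minimum over all s <= r
        result.append((r, best))
    return result


# Made-up one-point constants, deliberately not monotone in r.
one_point = [(0.1, 50.0), (0.2, 80.0), (0.4, 10.0), (0.8, 30.0)]
alpha = compensating_function(one_point)
print(alpha)
```

By construction the resulting values are non-increasing in $r$ and never exceed the original $c_r$, which are the two properties used in the proof.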
We will be specifically interested in these inequalities for one special norm. We now define this norm and
recall some of its properties, without proofs. For a short introduction (with the results we need here), see
e.g. [Ale04]; for an extensive treatment we refer to [RR91].

Let $\phi$, $\psi$ be defined on $\mathbb{R}_+$ by $\psi(x) = x \log(1 + x)$, $\phi(x) = \psi(x^2)$. For any measurable $f$, define the Orlicz
norm (usually called the Luxemburg norm; there is another natural norm on the Orlicz space, but we won't
need it here) of $f$ to be:

$$ \|f\|_\phi = \inf\left\{ \lambda > 0, \ \int \phi\left( \frac{f}{\lambda} \right) d\mu \le 1 \right\}. $$

Note that, with this definition, $\|1\|_\phi \ne 1$. The set of functions $f$ for which this norm is finite is denoted $L_\phi$;
it is a vector space, and it is complete for the Orlicz norm. In the same way, if $\psi^*$, $\phi^*$ are the convex dual
functions of $\psi$, $\phi$, we may define the corresponding Orlicz spaces. It is easily seen that for every positive $f$,
$\|f^2\|_\psi = \|f\|_\phi^2$. The dual functions allow us to state the following Hölder-like property.

26 Proposition (Hölder-Orlicz). If $f$, $g$ are two measurable functions, respectively in $L_\psi$ and $L_{\psi^*}$, then
$fg$ is in $L^1$, and

$$ \left| \int f g \, d\mu \right| \le 2 \|f\|_\psi \|g\|_{\psi^*}. $$

The constant 2 is necessary because we work with Luxemburg norms. To conclude this account on
Orlicz norms, we recall here the norm of an indicator function:

27 Proposition. Let $A$ be a measurable set. Then $1_A$ is in the Orlicz space $L_{\psi^*}$ and:

$$ \|1_A\|_{\psi^*} = \Phi(\mu(A)), $$

where $\Phi(x) = 1 / (\psi^*)^{-1}(1/x)$. Moreover, for all $x$ sufficiently small, we have the following bound:

$$ \Phi(x) \le \frac{2}{\log(1/x)}. $$

Proof. Once again we refer to [Ale04, RR91] for the first result. The explicit bound on $\Phi$ follows easily from
the bound $\psi^*(x) \le x e^x$ and the definition of $\psi$. □
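
To make the Luxemburg norm concrete, here is a minimal numerical sketch on a finite probability space; the bisection routine and all names in it are illustrative choices and not part of the paper. It checks three facts recalled above: $\|1\|_\phi \ne 1$, the identity $\|f^2\|_\psi = \|f\|_\phi^2$, and the indicator formula, which for any Young function $\Theta$ reads $\|1_A\|_\Theta = 1/\Theta^{-1}(1/\mu(A))$ (it is applied here to $\psi$ itself rather than $\psi^*$, to avoid computing a convex conjugate).

```python
import math


def psi(x):
    return x * math.log(1.0 + x)


def phi(x):
    return psi(x * x)


def luxemburg_norm(young, values, weights, tol=1e-12):
    """inf{lam > 0 : sum_i w_i * young(|f_i| / lam) <= 1}, found by bisection."""
    if all(v == 0 for v in values):
        return 0.0

    def modular(lam):
        return sum(w * young(abs(v) / lam) for v, w in zip(values, weights))

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:  # enlarge hi until it is admissible
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


weights = [0.25, 0.75]                     # a two-point probability space
norm_one = luxemburg_norm(phi, [1.0, 1.0], weights)
norm_ind = luxemburg_norm(psi, [1.0, 0.0], weights)   # indicator of a set of mass 1/4
f = [2.0, 0.5]
lhs = luxemburg_norm(psi, [v * v for v in f], weights)
rhs = luxemburg_norm(phi, f, weights) ** 2
print(norm_one, norm_ind, lhs, rhs)
```

The indicator check uses the defining relation $\mu(A)\,\psi(1/\lambda) = 1$ for $\lambda = \|1_A\|_\psi$, which is equivalent to the closed formula.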



4.2 Measure-capacity inequalities for large sets and one-point inequalities 

Here we show the result which was used in the preceding section: if we can compare the measure and the 
capacity of large sets, we can deduce a one-point weak inequality. 

28 Theorem. Suppose that there exists $\kappa < 1/2$, and a real constant $C_\kappa$ such that, for every set $A$ whose
measure is larger than $\kappa$, we have:

$$ \mathrm{Cap}_\mu(A) \ge C_\kappa \mu(A). \qquad (17) $$

Then $\mu$ satisfies the one-point weak Poincaré inequality:

$$ \mathrm{Var}_\mu(g) \le \frac{c}{C_\kappa} \int |\nabla g|^2 \, d\mu + \kappa \, \mathrm{osc}^2(g), $$

where $c$ is universal. We may replace the $L^\infty$ norm by an Orlicz norm, in which case the inequality reads:

$$ \mathrm{Var}_\mu(g) \le \frac{c}{C_\kappa} \int |\nabla g|^2 \, d\mu + \Phi(\kappa) \|g - m_g\|_\phi^2. $$

Remark. Note that if (17) holds for all sets, regardless of their measure, then $\mu$ satisfies a (strong) Poincaré
inequality (since we may take $\kappa = 0$). This is well-known, cf. and references therein. This characterization
of a functional inequality in terms of a relation between measures and capacities of sets is in fact
more general, and provides a way to compare many functional inequalities. For a detailed account on these
questions, and links with isoperimetric properties, we refer to [BCR] (especially section 5).

Proof. We follow the proof of theorem 2 in [BCR05] (which deals with the (full) weak inequality).

Let $f$ be a function and $m$ a median for $f$. We cut the space in half, according to whether $f$ is greater
than $m$ or not; we denote by $\Omega_+$, $\Omega_-$ the two sets. The integral may be written as:

$$ \mathrm{Var}_\mu(f) \le \int (f - m)^2 \, d\mu = \int_{\Omega_+} (f - m)^2 \, d\mu + \int_{\Omega_-} (f - m)^2 \, d\mu. $$
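We will show how to deal with the leftmost integral, the other one being similar. Let $g = (f - m) 1_{\Omega_+}$, and define

$$ c = \inf\{t \ge 0, \ \mu(g^2 > t) \le \kappa\}. $$

If $c$ is zero, then $\mu(g > 0)$ is less than $\kappa$, and:

$$ \int g^2 \, d\mu \le \begin{cases} \kappa \sup g^2 & \text{in the } L^\infty \text{ case,} \\ 2 \Phi(\kappa) \|f - m\|_\phi^2 & \text{in the Orlicz case,} \end{cases} $$

so the inequalities we are looking for hold in the half-space $\Omega_+$.

Thus we need only consider the case where $c$ is strictly positive. By a continuity argument ($\mu$ will always
have a density), we can find a set $\Omega_0$ such that $\mu(\Omega_0) = \kappa$ and $\{g^2 > c\} \subset \Omega_0 \subset \{g^2 \ge c\}$. We fix a $\rho > 1$,
and introduce the level sets $\Omega_k = \{g^2 \ge c/\rho^k\}$. We decompose the integral over these sets:

$$ \int g^2 \, d\mu = \int_{\Omega_0} g^2 \, d\mu + \sum_{k \ge 1} \int_{\Omega_k \setminus \Omega_{k-1}} g^2 \, d\mu \le \int_{\Omega_0} g^2 \, d\mu + \sum_{k \ge 1} \frac{c}{\rho^{k-1}} \left( \mu(\Omega_k) - \mu(\Omega_{k-1}) \right). $$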
JQo k>1 P 



The sum is dealt with thanks to an Abel transform: 



P 

k>l r 



E Pk \ - Pk_ 

pk—1 pk 
k>l F fe>0 F 



fe>i 



This is where we do not follow [BCR05]: since we simply suppose an inequality between capacity and
measure, we can get rid of the $\mu_0$ and write

fc>l F k>l F

The rest of the proof follows the same line as in [BCR05]: at this point, we use the measure-capacity
inequality on each set $\Omega_k$. They are designed to have their measure bigger than $\kappa$, so that we may apply
our hypothesis:

$$ \mu_k \le \frac{1}{C_\kappa} \mathrm{Cap}(\Omega_k). $$

Now, to bound the capacity from above, we apply the definition with well-chosen functions



. I, / 9~V C P k ~ 
g k = mm 1, 1 



y/cp k - y/cp- 



k-l 



This entails: 



Pk < J \Vg k \ 2 dp 

p k+1 

O n c(y/p- 1 



k+l 

< Tl ? ^ / \Vgfdp. 



k — 1 



Summing over k, we get: 



We may now choose p; the (non optimal) choice p = 4 gives: 



g 2 dp < [ g 2 dp + H / |V 3 | 2 d M - 
The only thing left to do is to take care of the integral on $\Omega_0$. This is done with a Hölder-like inequality.
In the Orlicz norm case, for example, we write:

$$ \int_{\Omega_0} g^2 \, d\mu \le 2 \|g^2\|_\psi \|1_{\Omega_0}\|_{\psi^*} \le 2 \|(f - m)^2\|_\psi \Phi(\kappa) \le 2 \Phi(\kappa) \|f - m\|_\phi^2, $$

thanks to the Hölder-Orlicz inequality and the relation between $\phi$ and $\psi$ (see the beginning of this section).

□

4.3 Weak inequalities for different norms 

To conclude this section, let us state a corollary to the previous result, and prove that weak Poincaré
inequalities for many different norms are in fact equivalent. Moreover, if a compensating function is known
for one norm, we can immediately deduce a function for another norm; this result was used in the one
dimensional case, where the explicit Hardy-like criteria were known for the $L^\infty$ norm.

29 Theorem. Let $\phi$, $\psi$ be two Young functions, with $\phi(x) = \psi(x^2)$. A measure $\mu$ satisfies a weak Poincaré
inequality with the $L^\infty$ norm if and only if it satisfies one with the Orlicz norm $\|\cdot\|_\phi$.

Moreover, if $\beta$ is a compensating function for the $L^\infty$ norm, then the following function may be chosen
for the Orlicz norm:

where $c$ is universal (and the same as in the preceding result).

Proof. First, let us introduce a few notations. We will denote by $\text{M-C}(\kappa, C(\kappa))$ the following comparison
between measure and capacity:

$$ \forall A, \quad \mu(A) \ge \kappa \implies \mathrm{Cap}(A) \ge C(\kappa) \mu(A). $$

Similarly, $\mathrm{PWP}(r, C(r), \mathcal{N})$ will denote the one-point weak Poincaré inequality for a norm $\mathcal{N}$ with constants
$(r, C(r))$, and $\mathrm{WP}(\alpha, \mathcal{N})$ will be the (full) weak inequality, with a norm $\mathcal{N}$ and a compensating function $\alpha$.
In the previous section, we showed:

M-C(k,C(k)) => PWP (k, | 

m-c(k, <?(«)) =>- pwp ^(4^,1 

Going the other way around is easy. Indeed, suppose that $\mathrm{PWP}(r, C(r), \|\cdot\|_\infty)$ holds. Let $A$ be a set whose
measure is less than 1/2, but greater than $4r$. Let $g$ be any function which may appear in the definition of
the capacity of $A$ (cf. (10)), and let $m_g$ be a median of $g$. Then:

$$ \mathrm{Var}_\mu(g) \le C_r \int |\nabla g|^2 \, d\mu + r \|g - m_g\|_\infty^2. $$

Without loss of generality, we suppose that $0 \le g \le 1$, so that the $L^\infty$ norm is bounded by 1. Moreover, $r$ is
less than $\mu(A)/4$, and the variance on the left hand side is bounded below by $(1/2) \min(\mu(A), 1/2) \ge \mu(A)/2$
(by the same argument used previously, during the proof of proposition 22). This entails:

$$ \frac{\mu(A)}{2} \le C_r \int |\nabla g|^2 \, d\mu + \frac{\mu(A)}{4}. $$

This immediately implies the measure capacity inequality $\text{M-C}(4r, 4/C_r)$.

If we now try to derive an inequality with an Orlicz norm starting from one with an L°° norm, we just 
translate them in terms of measure and capacity: 

PWP(r,a,|M|cc.) => M-C(4r,4/a) 

cC 

PWP(2^(4r), —£-). 

If we are looking for a full weak Poincare inequality, we fix an s, and define r = (l/4)-0 _1 (s/2). We may 
then apply PWP(r, j3{r)) to obtain: 

pwp^c/^^j-iu). 

Since s is arbitrary, this concludes the proof. □ 
A Orlicz norms, entropy and centering

The proof of weak Poincaré inequalities starting from measure-capacity comparisons for an Orlicz norm leads
us to consider norms of functions recentered by their median. In fact, what one obtains when applying these
criteria is of the form:

$$ \mathrm{Var}_\mu(f) \le \beta(s) \mathcal{E}(f) + s \|f - m_f\|_\phi^2, $$

where $m_f$ is a median for $f$. The aim of this section is to bound the last term by more tractable quantities (we
will use an entropy and a moment).

More precisely we prove the following result:

30 Lemma. There exists a $C$ such that, for any positive $f$ and any probability measure $\mu$, the following
holds:

$$ \|f - m_f\|_\phi^2 \le C \left( \mathrm{Ent}_\mu(f^2) + 3 \mathbb{E}_\mu(f^2) \right). $$

The proof is done in several steps, and borrows several arguments from [BG99]. First of all, we get rid
of the median and replace it by a mean value.

$$ \|f - m_f\|_\phi \le \|f - \mu f\|_\phi + \|\mu f - m_f\|_\phi \le \|f - \mu f\|_\phi + |\mu f - m_f| \, \|1\|_\phi. \qquad (18) $$

Let us consider the last term.

$$ \mu f - m_f = \int f(x) \, d\mu - m_f = \int (f - m_f)_+ \, d\mu - \int (f - m_f)_- \, d\mu, $$

where the integrals are both positive. The absolute value of the left hand side may then be bounded above:

$$ |\mu f - m_f| \le \max\left( \int (f - m_f)_+ \, d\mu, \ \int (f - m_f)_- \, d\mu \right). $$
Each of the arguments in the max can be controlled by Holder's inequality. 
(/ - m f ) + dn = / (/- m f )lf >mf dfi < \\f - m f \\ 2 \\lf >mf h 



<^V-m,h 
S^l/-m,|, 
Coming back to l(T8)l . we get: 

11/ - m/IU <\\f-»f\U + W - m f \ < ||/ - m/IU + 



(since /Lt(/ > m f ) < 1/2) 
(c/. |B(]99j . lemma 4.3) 

\f- m AU- 



Since w | < 1, we may put it on the other side to get: 



II/- m/IU <C||/- M/IU 



(19) 



5 ) 1 is universal. 



where C = (1 - ,. s . 

The next step is to bound the Orlicz norm by an entropy. Once again, we use a result from Bobkov and
Götze ([BG99]): for a universal constant $c$,

\[ \|f - \mu f\|_\Phi^2 \le c\,\sup_{a}\,\mathrm{Ent}_\mu\left((f + a)^2\right). \]

Since we would like to deal only with the entropy of $f^2$, we try to compare the entropies of translated
functions. Rothaus' lemma tells us:

\[ \mathrm{Ent}_\mu\left((f + a)^2\right) \le \mathrm{Ent}_\mu(\tilde f^2) + 2\,\mathrm{Var}_\mu(f), \]

where $\tilde f$ is the centered function $f - \mu f$. The only thing left to do is to bound the entropy of the square of
this centered function. This is done in the following lemma.
31 Lemma. Let $f$ be a positive function, and $\tilde f = f - \mu f$. Then the following holds:

\[ \mathrm{Ent}_\mu(\tilde f^2) \le \mathrm{Ent}_\mu(f^2) + \int f^2\,d\mu. \]

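Proof. Both sides of the inequality are homogeneous (of order two), so we may as well suppose $\int f^2\,d\mu = 1$.
We rewrite the left hand side.

\begin{align*}
\mathrm{Ent}_\mu(\tilde f^2) &= \int \tilde f^2 \log(\tilde f^2)\,d\mu - E_\mu(\tilde f^2)\log\left(E_\mu(\tilde f^2)\right) \\
  &= \int \tilde f^2 \log(\tilde f^2)\,d\mu - \mathrm{Var}_\mu(f)\log\left(\mathrm{Var}_\mu(f)\right).
\end{align*}

The second term is easily dealt with. Indeed, since $\int f^2\,d\mu = 1$, $\mathrm{Var}_\mu f$ must be between 0 and 1. Since
$x \mapsto |x\log(x)|$ is bounded by $1/e$ on this interval, one can write:

\[ \mathrm{Ent}_\mu(\tilde f^2) \le \int \tilde f^2\log(\tilde f^2)\,d\mu + \frac{1}{e}. \]

We decompose the integral in two parts, according to whether $|\tilde f|$ is less than 1 or not.

\begin{align*}
\mathrm{Ent}_\mu(\tilde f^2) &\le \int \tilde f^2\log(\tilde f^2)\,1_{|\tilde f| \le 1}\,d\mu + \int \tilde f^2\log(\tilde f^2)\,1_{|\tilde f| > 1}\,d\mu + \frac{1}{e} \\
  &\le \int \tilde f^2\log(\tilde f^2)\,1_{|\tilde f| > 1}\,d\mu + \frac{1}{e},
\end{align*}

since the first term is less than 0. Now, on the set where $|\tilde f|$ exceeds one, $f$ must be above its mean: $f$ is
indeed positive, and since $\int f^2\,d\mu = 1$, $\mu f$ must be in $[0, 1]$. So $|f - \mu f|$ may be greater than 1 only when $f$
itself is greater than 1. This shows that, on $\{|\tilde f| > 1\}$,

\[ 1 < \tilde f = f - \mu f < f. \]

Since $x \mapsto x\log(x)$ increases on $[1, \infty)$, we have:

\begin{align*}
\mathrm{Ent}_\mu(\tilde f^2) &\le \int \tilde f^2\log(\tilde f^2)\,1_{|\tilde f| > 1}\,d\mu + \frac{1}{e} \\
  &\le \int f^2\log(f^2)\,1_{|\tilde f| > 1}\,d\mu + \frac{1}{e}.
\end{align*}

At this point, remark that on $\{f > 1\}$, $f^2\log(f^2)$ is positive, and since $1_{|\tilde f| > 1} \le 1_{f > 1}$,

\begin{align*}
\mathrm{Ent}_\mu(\tilde f^2) &\le \int f^2\log(f^2)\,1_{f > 1}\,d\mu + \frac{1}{e} \\
  &\le \mathrm{Ent}_\mu(f^2) - \int f^2\log(f^2)\,1_{f \le 1}\,d\mu + \frac{1}{e} \\
  &\le \mathrm{Ent}_\mu(f^2) + \frac{2}{e}.
\end{align*}

Since $2/e < 1$, the proof is complete.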
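The inequality of lemma 31 can be checked numerically on finite probability spaces. The sketch below is only
an illustration: the helper names (`entropy`, `normalized_gap`, `random_case`), the random test functions and
the sample sizes are assumptions made for this example, not part of the original argument. It also checks the
elementary bound on $x \mapsto |x\log x|$ used in the proof.

```python
import math
import random

def entropy(values, weights):
    """Ent(g) = E[g log g] - E[g] log E[g] for g >= 0 on a finite probability space."""
    mean = sum(w * v for v, w in zip(values, weights))
    e_g_log_g = sum(w * v * math.log(v) for v, w in zip(values, weights) if v > 0.0)
    return e_g_log_g - mean * math.log(mean)

def normalized_gap(f, weights):
    """(Ent(f^2) + E[f^2] - Ent(f_tilde^2)) / E[f^2]; lemma 31 says this is >= 0."""
    mean_f = sum(w * v for v, w in zip(f, weights))
    f_sq = [v * v for v in f]
    centered_sq = [(v - mean_f) ** 2 for v in f]
    second_moment = sum(w * v for v, w in zip(f_sq, weights))
    gap = entropy(f_sq, weights) + second_moment - entropy(centered_sq, weights)
    return gap / second_moment

def random_case(rng, n):
    """Hypothetical test data: random weights and a positive f spread over several scales."""
    raw = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(raw)
    weights = [r / total for r in raw]
    f = [10.0 ** rng.uniform(-2.0, 2.0) for _ in range(n)]
    return f, weights

rng = random.Random(0)
min_gap = min(normalized_gap(*random_case(rng, 8)) for _ in range(2000))

# sup of |x log x| over a grid of (0, 1]; the proof uses the bound 1/e.
sup_xlogx = max(abs(x * math.log(x)) for x in (k / 10000 for k in range(1, 10001)))
```

By homogeneity, the proof actually gives the slightly stronger bound with $2/e$ in place of 1 in front of
$\int f^2\,d\mu$, so the normalized gap computed here should stay above $1 - 2/e$.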

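Gathering our results, we have shown that:

\begin{align*}
\|f - m_f\|_\Phi^2 &\le C^2\,\|f - \mu f\|_\Phi^2 && \text{(inequality (19))} \\
  &\le c\,C^2\,\sup_a\,\mathrm{Ent}_\mu\left((f + a)^2\right) && \text{(Bobkov and Götze's lemma)} \\
  &\le c\,C^2\left(\mathrm{Ent}_\mu(\tilde f^2) + 2\,\mathrm{Var}_\mu(f)\right) && \text{(Rothaus' lemma)} \\
  &\le c\,C^2\left(\mathrm{Ent}_\mu(f^2) + 3\,E_\mu(f^2)\right) && \text{(lemma 31)},
\end{align*}

where $C$ is the constant of inequality (19) and $c$ is the universal constant in Bobkov and Götze's lemma.
The last line is precisely the result we claimed in lemma 30. □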
B A moment bound



In this annex we prove the moment bound lemma. The proof mainly follows the one in Miclo's doctoral
dissertation, with a few changes to accommodate our hypotheses.

B.1 Outline of the proof

We need to introduce some notation. 

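For $\varepsilon > 0$, we denote by $\mathcal{L}_\varepsilon$ the generator of the diffusion at fixed temperature $\varepsilon$:

\[ \mathcal{L}_\varepsilon = \frac{\varepsilon}{2}\,\Delta - \frac{1}{2}\,\nabla V \cdot \nabla. \]

We will need a smooth version of a step function; we call it $f$ and suppose that it satisfies:

\[ f(x) = \begin{cases} 0 & \text{if } x < 0, \\ \exp(-\exp(1/x)) & \text{on } [0, 1], \\ 1 & \text{on } [2, \infty[. \end{cases} \]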

We recall the hypotheses on V: 

• It goes to infinity at infinity, 

• its gradient is bounded, and 

• its Laplacian $\Delta V$ is negative for large $x$.

Note that, since $V$ is continuous, there must be an $R$ such that $\Delta V$ is negative whenever $V(x) > R$.
Finally, let $g$ be an increasing function, going to zero at zero.

The idea of the proof is that, as time goes by, the value of $V$ at $X_t$ has a typical scale, namely $1/g(\sigma(t))$,
for a function $g$ to be made precise later, so that when we try to estimate $E(V^p(X_t))$, we only have to take
into account the small values of $V$.

More precisely, let $\rho_\varepsilon(\cdot) = f\left(g(\varepsilon)V(\cdot) - (R + 1)\right)$. This is a smooth approximation of
$1_{V \ge (R+1)/g(\varepsilon)}$. We may bound the expectation of $V^p(X_t)$:

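\begin{align*}
E[V^p(X_t)] &= E\left[V^p\rho_{\sigma(t)}(X_t)\right] + E\left[V^p\left(1 - \rho_{\sigma(t)}\right)(X_t)\right] \\
  &\le E[h_t(X_t)] + \left(\frac{R + 3}{g(\sigma(t))}\right)^p. \tag{20}
\end{align*}

To bound the first term, we use the explicit expression of the generator. Intuitively, we write, for
$h_t = V^p\rho_{\sigma(t)}$:

\[ \frac{d}{dt}(P_t h_t) = P_t\mathcal{L}_{\sigma(t)}h_t + P_t\left(\frac{d}{dt}h_t\right), \]

and integrate between two times $t'$ and $t$. To ensure that everything exists, we use the stopping time
$T_k = \inf\{t,\ V(X_t) > k\}$. We get:

\begin{align*}
E\left[h_{t \wedge T_k}(X_{t \wedge T_k})\right] = {} & E\left[h_{t' \wedge T_k}(X_{t' \wedge T_k})\right]
   + E\left[\int_{t' \wedge T_k}^{t \wedge T_k} \mathcal{L}_{\sigma(s)}(h_s)(X_s)\,ds\right] \\
 & + E\left[\int_{t' \wedge T_k}^{t \wedge T_k} \sigma'(s)\,g'(\sigma(s))\,f'\left(g(\sigma(s))V(X_s) - (R + 1)\right)V^{p+1}(X_s)\,ds\right].
\end{align*}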

Since $V$ is positive, $f$ and $g$ are increasing and $\sigma$ decreases, the whole last term is negative. We try to
estimate the second one, and study $\mathcal{L}_{\sigma(s)}h_s(X_s)$.

32 Lemma. Let us define $\varphi : x \mapsto x\log^2(x)$. There exists an $M$ and a time $t'$ (which may depend on $p$ and
on the initial law) such that:

\[ \forall t > t',\ \forall x, \quad \mathcal{L}_{\sigma(t)}(h_t)(x) \le \exp\left(-\frac{M}{\varphi\left(\sigma(t)g(\sigma(t))\right)}\right). \]
We postpone the proof and finish the argument. The inequality dictates the choice of $g$: $g = \ln(1/\cdot)^{-3}$
guarantees

\begin{align*}
\sigma(t)g(\sigma(t)) &\approx \frac{1}{\ln(t)(\ln\ln(t))^3}, \\
\varphi\left(\sigma(t)g(\sigma(t))\right) &\approx \frac{\ln^2\left(1/\left(\ln(t)(\ln\ln(t))^3\right)\right)}{\ln(t)(\ln\ln(t))^3}
   = \frac{\ln^2\left(\ln(t)(\ln\ln(t))^3\right)}{\ln(t)(\ln\ln(t))^3}.
\end{align*}

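Indeed, the upper bound on the generator then becomes

\begin{align*}
\mathcal{L}_{\sigma(t)}(h_t)(X_t) &\le \exp\left(-\frac{M}{\varphi\left(\sigma(t)g(\sigma(t))\right)}\right) \\
  &\le \exp\left(-M\ln(t) \times \frac{(\ln\ln(t))^3}{\ln^2\left(\ln(t)(\ln\ln(t))^3\right)}\right).
\end{align*}

Since the ratio $(\ln\ln(t))^3 / \ln^2\left(\ln(t)(\ln\ln(t))^3\right)$ goes to infinity, it eventually exceeds $2/M$, so that for $t$ big
enough,

\[ \mathcal{L}_{\sigma(t)}(h_t)(X_t) \le \exp(-2\ln(t)). \]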
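The ratio above diverges, but very slowly, which is easier to see in the variable $k = \ln\ln(t)$, where it reads
$k^3/(k + 3\ln k)^2$. The following sketch is an illustration only: the function names and the value $M = 3/8$
(borrowed from the proof of lemma 32, where the constant may still be adjusted) are assumptions of this example.

```python
import math

def ratio(k):
    """(ln ln t)^3 / ln^2(ln(t) (ln ln t)^3), written in terms of k = ln ln t."""
    return k ** 3 / (k + 3.0 * math.log(k)) ** 2

def first_k_exceeding(threshold, k_start=2.0, factor=1.01, k_max=1e12):
    """First point of a geometric grid in k where the ratio exceeds the threshold."""
    k = k_start
    while k <= k_max:
        if ratio(k) > threshold:
            return k
        k *= factor
    return None

M = 3.0 / 8.0                      # illustrative value of the constant M
k_star = first_k_exceeding(2.0 / M)
samples = [ratio(10.0), ratio(100.0), ratio(1000.0)]
```

With these choices the threshold $2/M$ is only passed for $\ln\ln(t)$ around 13, that is for astronomically
large times; the statement is asymptotic.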
Going back to the bound on the expected value we were looking for, the two previous arguments imply:

\[ E\left[h_{t \wedge T_k}(X_{t \wedge T_k})\right] \le E\left[h_{t' \wedge T_k}(X_{t' \wedge T_k})\right] + \int_{t'}^{t} \exp(-2\ln(s))\,ds. \]

We succeeded in making the last integral finite. We can then let $k$ go to infinity, and since $t'$ is fixed, we
get the existence of a constant $M_p$ (which depends on $p$ and on the initial law) such that:

\[ E[h_t(X_t)] \le M_p. \]

Plugging this back into inequality (20) yields:

\[ E[V^p(X_t)] \le M_p + \left(\frac{R + 3}{g(\sigma(t))}\right)^p. \]

The expression of $g$ shows that, for a new constant $M$:

\[ E[V^p(X_t)] \le M\left(\sigma(t)\ln(t)(\ln\ln(t))^3\right)^p, \]

and the result is proved.



B.2 An estimate on the generator 

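We now turn to the proof of lemma 32. We have to bound $\mathcal{L}_\varepsilon(\rho_\varepsilon V^p)(x)$, and our first step will be to give a
more explicit expression of this quantity. We will need the derivatives of $\rho_\varepsilon(x)$. To lighten the notation, we
will write $y = y(x, \varepsilon) = g(\varepsilon)V(x) - (R + 1)$.

\begin{align*}
\rho_\varepsilon(x) &= f\left(g(\varepsilon)V(x) - (R + 1)\right) = f(y), \\
\nabla\rho_\varepsilon(x) &= g(\varepsilon)f'(y)\nabla V(x), \\
\Delta\rho_\varepsilon(x) &= g(\varepsilon)^2 f''(y)|\nabla V|^2 + g(\varepsilon)f'(y)\Delta V.
\end{align*}

The quantity we would like to estimate is

\[ \mathcal{L}_\varepsilon(\rho_\varepsilon V^p)(x) = \rho_\varepsilon\mathcal{L}_\varepsilon V^p(x) + \varepsilon\langle\nabla\rho_\varepsilon, \nabla V^p\rangle(x) + V^p\mathcal{L}_\varepsilon\rho_\varepsilon(x). \]

We consider three cases, according to the value of $V(x)g(\varepsilon)$.

V is small: $V(x)g(\varepsilon) \in [0, R + 1]$. On this interval, $\rho_\varepsilon$ vanishes, so $\mathcal{L}_\varepsilon(\rho_\varepsilon V^p)$ is zero.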






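V is large. Let $\lambda$ be a strictly positive real, to be fixed later on. We consider the case where
$V(x)g(\varepsilon) \in [R + 1 + \lambda, \infty)$, which may be rewritten as $y \in [\lambda, \infty)$. We develop the expression of $\mathcal{L}_\varepsilon(\rho_\varepsilon V^p)$.

\[ \mathcal{L}_\varepsilon(\rho_\varepsilon V^p)(x) = \rho_\varepsilon\mathcal{L}_\varepsilon V^p(x) + \varepsilon g(\varepsilon)f'(y)\,pV^{p-1}|\nabla V|^2
   + V^p\left(\frac{\varepsilon}{2}\Delta\rho_\varepsilon - \frac{1}{2}\langle\nabla\rho_\varepsilon, \nabla V\rangle\right). \]

We compute the derivatives of $\rho_\varepsilon$ and put together the terms involving $|\nabla V|^2$.

\begin{align*}
\mathcal{L}_\varepsilon(\rho_\varepsilon V^p)(x) = {} & \rho_\varepsilon\mathcal{L}_\varepsilon V^p(x)
   + \left(\varepsilon g(\varepsilon)f'(y)\,pV^{p-1} + V^p\left(\frac{1}{2}\varepsilon g(\varepsilon)^2 f''(y) - \frac{1}{2}g(\varepsilon)f'(y)\right)\right)|\nabla V|^2 \\
 & + \frac{1}{2}\varepsilon g(\varepsilon)f'(y)\,V^p\,\Delta V \\
 = {} & A + B + C.
\end{align*}

Since $V(x)g(\varepsilon) > R$ and $g(\varepsilon)$ is small, $V(x) > R$. We already noted that $R$ may be chosen so that, if $V$ is bigger
than $R$, $\Delta V$ is less than zero, and this makes the third term $C$ negative. The term $B$ can be rewritten as:

\begin{align*}
 & \left(\varepsilon g(\varepsilon)f'(y)\,pV^{p-1} + V^p\left(\frac{1}{2}\varepsilon g(\varepsilon)^2 f''(y) - \frac{1}{2}g(\varepsilon)f'(y)\right)\right)|\nabla V|^2 \\
 & \qquad = V^p g(\varepsilon)\left(\left(\frac{\varepsilon p}{V} - \frac{1}{2}\right)f'(y) + \frac{1}{2}\varepsilon g(\varepsilon)f''(y)\right)|\nabla V|^2. \tag{21}
\end{align*}

We add another condition on $f$: it should be concave when $y$ is near 2 (e.g. on $[3/2, 2]$). On $[\lambda, 3/2]$, $f''/f'$
is bounded; let $M$ be a bound. This entails:

\[ \forall y \ge \lambda, \quad f''(y) \le M f'(y). \]

Coming back to $B$, we deduce:

\[ B \le \left(\frac{\varepsilon p}{V} + \frac{\varepsilon g(\varepsilon)M}{2} - \frac{1}{2}\right) f'(y)\,g(\varepsilon)V^p|\nabla V|^2. \]

The term between brackets is negative, uniformly in $x$, as soon as $\varepsilon$ is small enough.
Finally, the first term $A = \rho_\varepsilon\mathcal{L}_\varepsilon V^p$ is also negative. Indeed, $\rho_\varepsilon$ is nonnegative and

\begin{align*}
\mathcal{L}_\varepsilon V^p &= \frac{\varepsilon}{2}\Delta(V^p) - \frac{1}{2}\langle\nabla V, \nabla(V^p)\rangle \\
  &= \frac{\varepsilon}{2}\left(p(p - 1)V^{p-2}|\nabla V|^2 + pV^{p-1}\Delta V\right) - \frac{p}{2}V^{p-1}|\nabla V|^2 \\
  &= pV^{p-1}\left(\left(\frac{\varepsilon(p - 1)}{2V} - \frac{1}{2}\right)|\nabla V|^2 + \frac{\varepsilon}{2}\Delta V\right).
\end{align*}

Once more, the term between brackets is negative when $\varepsilon$ is small (recall that $V > R$ and $\Delta V \le 0$ here). To
conclude, for any $\lambda$, there exists an $\varepsilon_0$ such that:

\[ \forall\varepsilon < \varepsilon_0,\ \forall x, \quad V(x)g(\varepsilon) \ge R + 1 + \lambda \implies \mathcal{L}_\varepsilon(\rho_\varepsilon V^p)(x) \le 0. \]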

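V is of the order of $R/g(\varepsilon)$. This last case is that where $g(\varepsilon)V(x) \in [R + 1, R + 1 + \lambda]$. Let us reuse the
decomposition $\mathcal{L}_\varepsilon(\rho_\varepsilon V^p) = A + B + C$ from the previous paragraph. The same reasoning applies for $A$ and
$C$, and they are both negative, so it suffices to get a bound on $B$. From (21):

\[ B = \left(\left(\frac{\varepsilon p}{V} - \frac{1}{2}\right)f'(y) + \frac{1}{2}\varepsilon g(\varepsilon)f''(y)\right)g(\varepsilon)V^p|\nabla V|^2. \]

If we choose $R$ sufficiently big and $\varepsilon$ small enough, the quantity between brackets in front of $f'(y)$ is less
than $-1/4$.

\[ B \le \left(-\frac{1}{4}f'(y) + \frac{1}{2}\varepsilon g(\varepsilon)f''(y)\right)g(\varepsilon)V^p|\nabla V|^2. \]

Recall that $f = \exp(-\tau)$, where $\tau(y) = \exp(1/y)$. This implies:

\begin{align*}
B &\le \left(\frac{1}{4}\tau' f + \frac{1}{2}\varepsilon g(\varepsilon)\left(-\tau'' f + (\tau')^2 f\right)\right)g(\varepsilon)V^p|\nabla V|^2 \\
  &\le \frac{1}{2}\left(\frac{1}{2}\tau'(y)f(y) + \varepsilon g(\varepsilon)\left(\tau'(y)\right)^2 f(y)\right)g(\varepsilon)V^p|\nabla V|^2.
\end{align*}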
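Define $h_\varepsilon = \frac{1}{2}\tau' f + \varepsilon g(\varepsilon)\tau'^2 f$. We study it by differentiating:

\[ h'_\varepsilon = \left(\frac{1}{2}\tau'' - \frac{1}{2}\tau'^2 + 2\varepsilon g(\varepsilon)\tau'\tau'' - \varepsilon g(\varepsilon)\tau'^3\right)f. \]

The explicit expression of $\tau$ ensures:

\[ \exists\lambda,\ \forall y \in [0, \lambda], \quad 0 \le \tau''(y) \le \frac{1}{4}\tau'^2(y). \]

This $\lambda$ does not depend on $\varepsilon$. This can be used to bound $h'_\varepsilon$ from below:

\begin{align*}
h'_\varepsilon(y) &\ge \left(-\frac{1}{2}\tau'^2(y) + \frac{1}{2}\varepsilon g(\varepsilon)\tau'(y)^3 - \varepsilon g(\varepsilon)\tau'(y)^3\right)f(y) \\
  &\ge \left(-\frac{1}{2} - \frac{1}{2}\varepsilon g(\varepsilon)\tau'(y)\right)\tau'(y)^2 f(y).
\end{align*}

Let $y_{1,\varepsilon}$ be the solution of the equation $-1 - \varepsilon g(\varepsilon)\tau'(y) = 0$. When $\varepsilon$ is small, $y_{1,\varepsilon}$ will be less than $\lambda$,
and the monotonicity of $\tau'$ will give:

\[ \forall y \le y_{1,\varepsilon}, \quad h'_\varepsilon(y) \ge 0. \]

Similarly, $h'_\varepsilon$ can be bounded above:

\begin{align*}
h'_\varepsilon(y) &\le \left(\frac{1}{8}\tau'^2(y) - \frac{1}{2}\tau'^2(y) - \varepsilon g(\varepsilon)\tau'^3(y)\right)f(y) \\
  &\le \left(-\frac{3}{8} - \varepsilon g(\varepsilon)\tau'(y)\right)\tau'(y)^2 f(y).
\end{align*}

Now, let $y_{2,\varepsilon}$ be the root of $-\frac{3}{8} - \varepsilon g(\varepsilon)\tau'(y) = 0$. Once more, when $\varepsilon$ is small, $y_{2,\varepsilon}$ falls within $[0, \lambda]$. We
deduce:

\[ \forall y \in [y_{2,\varepsilon}, \lambda], \quad h'_\varepsilon(y) \le 0. \]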

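We now know that $h_\varepsilon$ increases on $[0, y_{1,\varepsilon}]$, and decreases on $[y_{2,\varepsilon}, \lambda]$, so that its maximum must be reached
somewhere between these two points. More precisely, whenever $\varepsilon$ is less than some $\varepsilon_0$, it holds that

\[ \exists y_\varepsilon \in [y_{1,\varepsilon}, y_{2,\varepsilon}],\ \forall y \in [0, \lambda], \quad h_\varepsilon(y) \le h_\varepsilon(y_\varepsilon). \]

The bounds on $y_\varepsilon$, the fact that $\tau$ decreases and the equations defining $y_{1,\varepsilon}$, $y_{2,\varepsilon}$ allow us to conclude:

\begin{align*}
\forall y \le \lambda, \quad h_\varepsilon(y) &\le \left(\frac{1}{2}\tau'(y_\varepsilon) + \varepsilon g(\varepsilon)\tau'(y_\varepsilon)^2\right)f(y_\varepsilon) \\
  &\le \left(\frac{1}{2}\tau'(y_{2,\varepsilon}) + \varepsilon g(\varepsilon)\tau'(y_{1,\varepsilon})^2\right)f(y_{2,\varepsilon}) \\
  &\le \left(-\frac{3}{16\,\varepsilon g(\varepsilon)} + \frac{1}{\varepsilon g(\varepsilon)}\right)f(y_{2,\varepsilon}) \\
  &\le \frac{1}{\varepsilon g(\varepsilon)}\,f(y_{2,\varepsilon}).
\end{align*}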
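The location of the maximum of $h_\varepsilon$ can be checked on a grid. The sketch below is illustrative only: the value
`eps_g` standing for $\varepsilon g(\varepsilon)$, the value `lam` standing for $\lambda$, the grid and the bisection helper are all
assumptions of this example (for `lam = 0.45` one can check by hand that $\tau'' \le \tau'^2/4$ on $(0, \lambda]$).

```python
import math

def tau_prime(y):
    """tau'(y) for tau(y) = exp(1/y); negative, and |tau'| decreases in y > 0."""
    return -math.exp(1.0 / y) / (y * y)

def f(y):
    """f = exp(-tau); underflows harmlessly to 0.0 for very small y."""
    return math.exp(-math.exp(1.0 / y))

def h(y, eps_g):
    """h(y) = (tau'/2 + eps_g * tau'^2) * f, with eps_g standing for eps * g(eps)."""
    tp = tau_prime(y)
    return (0.5 * tp + eps_g * tp * tp) * f(y)

def solve_level(level, lo=0.005, hi=1.0, iters=200):
    """Bisection for |tau'(y)| = level on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if -tau_prime(mid) > level:
            lo = mid   # |tau'| still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps_g = 1e-3                              # illustrative value of eps * g(eps)
lam = 0.45                                # illustrative value of lambda
y1 = solve_level(1.0 / eps_g)             # root of -1 - eps_g * tau'(y) = 0
y2 = solve_level(3.0 / (8.0 * eps_g))     # root of -3/8 - eps_g * tau'(y) = 0
step = 1e-4
grid = [0.02 + k * step for k in range(int((lam - 0.02) / step) + 1)]
y_max = max(grid, key=lambda y: h(y, eps_g))
```

Up to the grid resolution, `y_max` should fall between `y1` and `y2`, and the value of `h` there should stay
below `f(y2) / eps_g`, as in the last line of the chain of inequalities.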

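It remains to estimate $f(y_{2,\varepsilon}) = \exp(-\tau(y_{2,\varepsilon}))$. Since $y_{2,\varepsilon}$ is defined as a solution of an equation involving
$\tau'$, we would like to compare $\tau$ and $\tau'$. The explicit expression of $\tau$ easily implies, for $y \le 1$:

\[ \ln(|\tau'(y)|) = \ln(y^{-2}) + \frac{1}{y} \ge \frac{1}{y}, \]

therefore:

\[ \tau(y) = y^2|\tau'(y)| \ge \frac{|\tau'(y)|}{\ln^2(|\tau'(y)|)}. \]

Applying this for $y = y_{2,\varepsilon}$, for which $|\tau'(y)| = 3/(8\varepsilon g(\varepsilon))$, entails:

\begin{align*}
\tau(y_{2,\varepsilon}) &\ge \frac{3}{8\,\varepsilon g(\varepsilon)\ln^2\left(8\varepsilon g(\varepsilon)/3\right)} \\
  &\ge \frac{3}{8\,\varepsilon g(\varepsilon)\ln^2\left(\varepsilon g(\varepsilon)\right)}.
\end{align*}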
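The comparison between $\tau$ and $\tau'$ is easy to test numerically. In the sketch below, the sample values of
`eps_g` (standing for $\varepsilon g(\varepsilon)$), the bisection bracket and the helper names are assumptions made for
illustration; $M = 3/8$ and $\varphi(x) = x\ln^2(x)$ are the quantities used in the text.

```python
import math

def tau(y):
    """tau(y) = exp(1/y), so that f = exp(-tau) on (0, 1]."""
    return math.exp(1.0 / y)

def abs_tau_prime(y):
    """|tau'(y)| = y^(-2) exp(1/y); decreasing in y > 0."""
    return math.exp(1.0 / y) / (y * y)

def solve_y2(eps_g, lo=0.005, hi=1.0, iters=200):
    """Bisection for |tau'(y)| = 3 / (8 eps_g) on [lo, hi]."""
    target = 3.0 / (8.0 * eps_g)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if abs_tau_prime(mid) > target:
            lo = mid   # |tau'| too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi(x):
    return x * math.log(x) ** 2

M = 3.0 / 8.0
checks = []
for eps_g in (1e-1, 1e-2, 1e-4, 1e-8):    # illustrative values of eps * g(eps)
    y = solve_y2(eps_g)
    general = tau(y) >= abs_tau_prime(y) / math.log(abs_tau_prime(y)) ** 2
    chained = tau(y) >= M / phi(eps_g)
    checks.append((eps_g, y, general, chained))
```

Each entry of `checks` records the root $y_{2,\varepsilon}$ found for the given `eps_g` and whether the two lower bounds on
$\tau(y_{2,\varepsilon})$ hold there.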
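Turning back to $f$, and defining $\varphi : x \mapsto x\ln^2(x)$ and $M = 3/8$, we have:

\[ f(y_{2,\varepsilon}) = \exp(-\tau(y_{2,\varepsilon})) \le \exp\left(-\frac{M}{\varphi(\varepsilon g(\varepsilon))}\right). \]

We now come back to the upper bound on $B$, and plug in the last equation.

\[ B \le \frac{1}{2\,\varepsilon g(\varepsilon)}\exp\left(-\frac{M}{\varphi(\varepsilon g(\varepsilon))}\right)g(\varepsilon)V^p|\nabla V|^2. \]

Since we suppose that $V(x)g(\varepsilon)$ belongs to $[R + 1, R + 2]$, we may bound $V^p$ by a constant multiple of $g(\varepsilon)^{-p}$. We
also supposed that $\nabla V$ is bounded, so that there exists an $M'$ such that:

\[ B \le \frac{M'}{\varepsilon g(\varepsilon)^p}\exp\left(-\frac{M}{\varphi(\varepsilon g(\varepsilon))}\right). \]

Up to a slight change of the constant $M$ in the exponential, we may neglect the pre-exponential term and
write:

\[ B \le \exp\left(-\frac{M}{\varphi(\varepsilon g(\varepsilon))}\right). \]

This concludes the proof.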



C Regularity results and estimates on the process
C.1 An equivalent of the partition function

We recall here Laplace's method, which enables us to study the asymptotic behaviour of the partition function,
i.e. the constant $Z_\sigma = \int \exp(-V/\sigma)\,dx$.

33 Theorem. Let $V$ be a function from $\mathbb{R}^d$ to $\mathbb{R}$, satisfying our two standing hypotheses ($V$ has a unique, well
behaved, global minimum, and $V$ goes to infinity at infinity rapidly enough). Then $Z_\sigma$ exists, and the following
holds as $\sigma$ goes to zero:

\[ Z_\sigma \sim \frac{(2\pi\sigma)^{d/2}}{\sqrt{\det\mathrm{Hess}\,V(0)}}. \]

To prove this classical result, we cut the integral in two parts, the main one (near the origin) and a
remainder. Before we proceed, let us remark that, up to a change of coordinates, we may as well suppose
that $\mathrm{Hess}(V)_0$ is a diagonal matrix with eigenvalues $\lambda_i$, and we have Taylor's formula:

\[ V(x) = \sum_i \frac{\lambda_i}{2}x_i^2 + \epsilon(x)\sum_i x_i^2, \]

where $\epsilon(x)$ goes to zero at 0. We choose an $r$ such that, on $B = [-r, r]^d$, $|\epsilon(x)| \le \frac{1}{4}\inf_i\lambda_i$.

Let us begin with the negligible part, outside of $B$. Since $V$ goes to infinity, and 0 is the unique global
minimum, there exists an $\eta > 0$ such that $V(x) > \eta$ outside $B$. We introduce an $\exp(-V)$ in the integral
(the growth hypothesis makes it integrable), and use this bound:

\begin{align*}
\int_{x \notin B}\exp(-V/\sigma)\,dx &= \int_{x \notin B}\exp(-V)\exp\left(-(1/\sigma - 1)V(x)\right)dx \\
  &\le \int_{x \notin B}\exp(-V)\,dx\ \exp\left(-(1/\sigma - 1)\eta\right) \\
  &\le Z_1\exp\left(-(1/\sigma - 1)\eta\right).
\end{align*}

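Let us turn to the main term. We divide it by $\sigma^{d/2}$ (so that we only have to find a limit). We change
variables and use $x = \phi_\sigma(y)$ defined by $x_i = y_i\sqrt{\sigma/\lambda_i}$.

\begin{align*}
\sigma^{-d/2}\int_B\exp(-V/\sigma)\,dx_1\cdots dx_d
  &= \sigma^{-d/2}\int 1_{x \in B}\exp\left(-\frac{1}{\sigma}\left(\sum_i\frac{\lambda_i}{2}x_i^2 + \epsilon(x)\sum_i x_i^2\right)\right)dx \\
  &= \frac{1}{\sqrt{\prod_i\lambda_i}}\int 1_{\phi_\sigma(y) \in B}\exp\left(-\left(\frac{1}{2}\sum_i y_i^2 + \epsilon(\phi_\sigma(y))\sum_i\frac{y_i^2}{\lambda_i}\right)\right)dy.
\end{align*}

The function inside the integral converges pointwise to $\exp\left(-\frac{1}{2}\sum_i y_i^2\right)$ when $\sigma$ goes to zero (because $\phi_\sigma(y)$
goes to zero for a fixed $y$). It is bounded from above by the integrable function $\exp\left(-\frac{1}{4}\sum_i y_i^2\right)$, and we may
apply Lebesgue's dominated convergence:

\[ \sigma^{-d/2}\int_B\exp(-V/\sigma)\,dx_1\cdots dx_d \longrightarrow \frac{1}{\sqrt{\prod_i\lambda_i}}\int\exp\left(-\frac{1}{2}\sum_i y_i^2\right)dy = \frac{(2\pi)^{d/2}}{\sqrt{\det\mathrm{Hess}\,V(0)}}. \]

With the bound on the remainder, this gives the equivalent of $Z_\sigma$.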
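As a sanity check of the equivalent, one can compare a numerical value of $Z_\sigma$ with $(2\pi\sigma)^{d/2}/\sqrt{\det\mathrm{Hess}\,V(0)}$
in dimension one. The potential $V(x) = x^2 + x^4$, the truncation of the integral to $[-1, 1]$ and the trapezoidal
rule used below are all assumptions chosen for this illustration; they do not come from the original text.

```python
import math

def potential(x):
    """Hypothetical 1-d potential: unique minimum at 0, V(0) = 0, V''(0) = 2."""
    return x * x + x ** 4

def partition_function(sigma, half_width=1.0, n=200_000):
    """Trapezoidal approximation of the integral of exp(-V/sigma) over [-half_width, half_width]."""
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-potential(x) / sigma)
    return total * step

def laplace_equivalent(sigma, hessian_det=2.0, d=1):
    """(2 pi sigma)^(d/2) / sqrt(det Hess V(0))."""
    return (2.0 * math.pi * sigma) ** (d / 2) / math.sqrt(hessian_det)

ratios = {s: partition_function(s) / laplace_equivalent(s) for s in (1e-1, 1e-2, 1e-3)}
```

The ratios should approach 1 as `sigma` decreases; the part of the integral outside $[-1, 1]$ is of order
$\exp(-2/\sigma)$ here, which matches the exponential bound on the remainder obtained in the proof.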
C.2 Finiteness of the entropy and regularity 

We begin by proving that the relative entropy $I_t$ is finite. To do this, we study directly the explicit density,
which we know thanks to a Girsanov transform. We follow a proof from [Roy99], with a few minor changes
to deal with the non-homogeneity in time.

Recall that the process $X$ is defined by the following SDE:

\[ dX_t = \sqrt{\sigma(t)}\,dB_t - \frac{1}{2}\nabla V(X_t)\,dt. \]

If we define a new reference martingale $M_t = \int_0^t\sqrt{\sigma(s)}\,dB_s$, we may define $X$ as the solution to the SDE:

\[ dX_t = dM_t - \frac{1}{2}\nabla V(X_t)\,dt. \]

Note that $M_t$ is just a Brownian motion under a (deterministic) change of time: if we define $\tau(t) =
\int_0^t\sigma(s)\,ds$, then $M_{\tau^{-1}(t)}$ is a Brownian motion. To find the density of the law of $X_t$ with respect to its equilibrium
measure $\mu_t$, we decompose it in three terms, $\lambda$ denoting the Lebesgue measure:

\[ \frac{d\mathcal{L}(X_t)}{d\mu_t} = \frac{d\mathcal{L}(X_t)}{d\mathcal{L}(M_t)} \cdot \frac{d\mathcal{L}(M_t)}{d\lambda} \cdot \frac{d\lambda}{d\mu_t}. \]

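To compute the first term, we use the (trajectorial) density of $X$ with respect to $M$, which is given by
Girsanov's theorem:

\begin{align*}
F &= \exp\left(-\frac{1}{2}\int_0^t\nabla V(M_s)\,dM_s - \frac{1}{2}\int_0^t\frac{|\nabla V|^2}{4}(M_s)\,d\langle M\rangle_s\right) \\
  &= \exp\left(-\frac{1}{2}\int_0^t\nabla V(M_s)\,dM_s - \frac{1}{8}\int_0^t|\nabla V|^2(M_s)\,\sigma(s)\,ds\right).
\end{align*}

To get rid of the martingale term in the exponential, we apply Itô's formula to $V$ and the martingale $M$:

\[ V(M_t) = V(x) + \int_0^t\nabla V(M_s)\,dM_s + \frac{1}{2}\int_0^t\Delta V(M_s)\,d\langle M\rangle_s. \]

The functional $F$ may thus be rewritten:

\[ F = \exp\left(\frac{1}{2}V(x) - \frac{1}{2}V(M_t) + \int_0^t\left(\frac{1}{4}\Delta V(M_s) - \frac{1}{8}|\nabla V|^2(M_s)\right)\sigma(s)\,ds\right). \]

The three densities we are looking for are:

\begin{align*}
\frac{d\mathcal{L}(X_t)}{d\mathcal{L}(M_t)}(M_t) &= E[F \mid M_t], \\
\frac{d\mathcal{L}(M_t)}{d\lambda}(y) &= \exp(-2v_t(y)) = (2\pi\tau(t))^{-d/2}\exp\left(-\frac{|x - y|^2}{2\tau(t)}\right), \\
\frac{d\lambda}{d\mu_t}(y) &= Z_{\sigma(t)}\exp\left(\frac{V(y)}{\sigma(t)}\right).
\end{align*}

We take the product of these terms; the last two quantities may be put into the conditional expectation, so
that the density we are looking for (say $G$) may be written as:

\[ G(M_t) = Z_{\sigma(t)}\,E\left[F\exp\left(\frac{V(M_t)}{\sigma(t)} - 2v_t(M_t)\right) \,\Big|\, M_t\right]. \]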
Let us now define $\gamma : x \mapsto x\log(x)$, and start to study $I_t$. By definition, $I_t = \int\gamma(G(y))\,d\mu_t(y)$. Since $G$ is
best expressed as a conditional expectation, we rewrite $I_t$:

\begin{align*}
I_t &= E\left[\gamma(G(M_t))\,\frac{d\mu_t}{d\mathcal{L}(M_t)}(M_t)\right] \\
    &= E\left[\gamma(G(M_t))\,\frac{1}{Z_{\sigma(t)}}\exp\left(-\frac{V(M_t)}{\sigma(t)} + 2v_t(M_t)\right)\right]. \tag{22}
\end{align*}



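Since $\gamma$ is convex, we may apply Jensen's conditional inequality to $\gamma(G(M_t))$, and develop $\gamma$:

\begin{align*}
\gamma(G(M_t)) &\le E\left[\gamma\left(Z_\sigma F\exp\left(\frac{V}{\sigma} - 2v_t\right)\right) \,\Big|\, M_t\right] \\
  &= E\left[Z_\sigma F\exp\left(\frac{V}{\sigma} - 2v_t\right)\left(\log Z_\sigma + \log F + \frac{V}{\sigma} - 2v_t\right) \,\Big|\, M_t\right].
\end{align*}

Multiply both sides by $Z_\sigma^{-1}\exp(-V/\sigma + 2v_t)$, and take the expected value; the left hand side becomes $I_t$ (thanks
to (22)), the conditioning disappears and we get:

\[ I_t \le E\left[F\left(\log Z_\sigma + \log F + \frac{V(M_t)}{\sigma} - 2v_t(M_t)\right)\right]. \]

Recall that $F$ is a density, so that $E[F] = 1$, and we may take the constant $Z_\sigma$ out of the expectation. We
add and subtract $(2/\sigma)\log(F)$ inside the integral; this will help us get rid of the term $V(M_t)/\sigma$:

\[ I_t \le \log(Z_\sigma) - (2/\sigma - 1)\,E[F\log F] + E\left[F\left(\frac{2}{\sigma}\log F + \frac{V(M_t)}{\sigma} - 2v_t(M_t)\right)\right]. \]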



Since $x\log x$ is bounded below, and $2/\sigma - 1$ is positive, the second term is bounded from above (for any
finite time $t$). The same is true for the first term. The only thing to check is that the last term is finite; let
us call this term $A$. Since $F$ is given by an exponential, $A$ is given by:

\[ A = E\left[F\left(\frac{1}{\sigma}V(x) + \frac{1}{4\sigma(t)}\int_0^t\left(2\Delta V - |\nabla V|^2\right)(M_s)\,\sigma(s)\,ds - 2v_t(M_t)\right)\right]. \]

Let us consider the quantity between brackets. The first term is finite and does not depend on $M_t$. The
integral is bounded above by something also independent of $M_t$ (indeed, $2\Delta V - |\nabla V|^2$ is uniformly bounded
from above, because $\Delta V$ is negative outside a compact set). The only thing left to check is that:

\[ E\left[F\left(-2v_t(M_t)\right)\right] < \infty. \]

We have already seen the explicit value of $v_t$:

\[ \exp(-2v_t(y)) = (2\pi\tau(t))^{-d/2}\exp\left(-\frac{|y - x|^2}{2\tau(t)}\right). \]

Taking logarithms, we see that:

\[ -2v_t(y) = -\frac{d}{2}\log(2\pi\tau(t)) - \frac{|y - x|^2}{2\tau(t)}. \]

Since $|y - x|^2/(2\tau(t))$ is nonnegative, this quantity is bounded from above by something which does not depend on
$y$. Therefore, $E[-F \times 2v_t(M_t)] < \infty$. This concludes the proof.